InterviewBee — Android Developer
Question Bank
Question 1: Architecture — Designing a Scalable Android App with Clean Architecture and Jetpack Compose
Difficulty: Senior | Role: Android Developer | Level: Senior | Company Examples: Google, Spotify, Airbnb, Lyft, Square
The Question
You are a Senior Android Developer joining a team that has a 4-year-old Android app with 2 million daily active users. The codebase is a single-module monolith written in Java, using an MVP pattern with Activities as presenters, no dependency injection, and SharedPreferences for all local storage. The app has grown to 180,000 lines of code. Build times take 14 minutes. The team has 8 developers and ships weekly, but feature development has slowed significantly — adding a new screen takes an average of 3 days due to the tangled dependency graph. The Head of Engineering has approved a full architectural migration to Clean Architecture with Kotlin, Jetpack Compose, Hilt, and Room. You have been asked to design the migration strategy, the target architecture, and how you handle the transition without interrupting weekly releases. Walk through your architectural design decisions, the module structure, and the migration sequencing.
1. What Is This Question Testing?
- Clean Architecture for Android — understanding the three-layer Clean Architecture model in an Android context: the data layer (repositories, data sources — Room DAOs, Retrofit API services, remote data sources), the domain layer (use cases / interactors — pure Kotlin with no Android framework dependencies, the single most important architectural decision that makes the business logic testable without an emulator), and the presentation layer (ViewModels that hold UI state, Compose UI that renders state and emits events); knowing that the dependency rule (outer layers depend on inner layers, never the reverse) is what makes the architecture testable and maintainable
- Modularisation strategy — understanding that splitting a 180,000-line monolith into modules reduces build times because Gradle can cache and incrementally rebuild only changed modules; knowing the modularisation approaches: by layer (:data, :domain, :presentation modules), by feature (:feature:home, :feature:search, :feature:profile), or hybrid (feature modules each containing their own data/domain/presentation); for an 8-developer team working on multiple features simultaneously, feature modularisation with a shared :core module produces the best combination of build time reduction and team autonomy
- The strangler fig migration pattern — understanding that a big-bang rewrite of a 180,000-line production codebase is high-risk; the strangler fig pattern (introduce the new architecture incrementally, one feature at a time, while the old architecture continues to operate around it) allows weekly releases to continue throughout the migration; knowing specifically how to implement this for Android: new features are built in new feature modules using the new architecture; old features are migrated one at a time, starting with the most isolated
- Jetpack Compose and the ViewModel contract — understanding that in a Compose UI, the ViewModel exposes UI state as a StateFlow&lt;UiState&gt; (or StateFlow&lt;List&lt;T&gt;&gt; for simpler cases) which Compose collects with collectAsStateWithLifecycle(); the ViewModel receives UI events from Compose via functions; the ViewModel never holds a reference to the Compose UI — this one-directional data flow is what makes Compose screens independently testable
- Dependency injection with Hilt — knowing that Hilt is the official Android DI framework built on top of Dagger; understanding the Hilt component hierarchy (SingletonComponent for application-scoped dependencies like Retrofit and Room, ViewModelComponent for ViewModel-scoped dependencies like use cases, ActivityComponent for Activity-scoped dependencies); and knowing how to scope dependencies correctly to avoid memory leaks (a Repository should be singleton-scoped, not ViewModel-scoped)
- Data layer architecture — understanding that a Repository is the single source of truth for a data domain; it abstracts over multiple data sources (a UserRepository that delegates to UserRemoteDataSource for API calls and UserLocalDataSource for Room caching); the repository pattern with Room as the single source of truth and Retrofit as the remote data source (with a "fetch from network, write to Room, always read from Room" pattern) is the standard Android offline-first architecture
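The offline-first repository pattern described above can be sketched framework-free. This is an illustrative sketch: RemoteSource and LocalSource are hypothetical stand-ins for a Retrofit service and a Room DAO, and a real repository would expose a Flow from Room rather than a plain list.

```kotlin
// Framework-free sketch of the "fetch from network, write to Room, always
// read from Room" repository pattern. RemoteSource and LocalSource are
// hypothetical stand-ins for a Retrofit service and a Room DAO.
interface RemoteSource { fun fetchUsers(): List<String> }

interface LocalSource {
    fun save(users: List<String>)
    fun read(): List<String>
}

class UserRepository(
    private val remote: RemoteSource,
    private val local: LocalSource,
) {
    // Callers only ever read the local cache — the single source of truth.
    // A real implementation would expose a Flow<List<User>> from Room.
    fun users(): List<String> = local.read()

    // Refresh pulls from the network and writes through the cache.
    fun refresh() = local.save(remote.fetchUsers())
}
```

Because the UI layer never reads the network directly, testing this class needs only in-memory fakes for the two sources.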
2. Framework: Android Clean Architecture Migration Model (ACAMM)
- Assumption Documentation — Establish the current architecture's pain points quantitatively: which Activities have the most dependencies? Which features take the longest to build and test? What is the current unit test coverage? (Likely near-zero given the MVP-in-Activity pattern.) These measurements establish the migration priority — start migrating the features that are most actively developed, not the oldest or most complex
- Constraint Analysis — Weekly releases must continue throughout the migration (no release freeze); Java-to-Kotlin migration can be incremental (Kotlin is fully interoperable with Java — new files can be Kotlin while existing files remain Java); 8 developers working simultaneously means the module structure must be designed to minimise merge conflicts
- Tradeoff Evaluation — Layer modularisation (:core:data, :core:domain, :core:presentation) — simpler structure, faster to establish, but creates a monolith within each layer; feature modularisation (:feature:home, :feature:search) — more upfront design, but allows teams to work on features independently with fewer merge conflicts; hybrid (features own their own layers, sharing a :core module for cross-cutting concerns) — the recommended approach for an 8-developer team
- Hidden Cost Identification — The :app module must remain thin (it is the entry point — it wires up navigation and initialises Hilt); a common mistake is accumulating non-feature code in :app because it is the only module that "sees everything"; the migration plan must explicitly prevent :app from becoming a new monolith
- Risk Signals / Early Warning Metrics — Build time trend (should decrease with each new feature module added, as Gradle caches module outputs independently), lines of code in the :app module (should trend toward zero as features migrate), unit test coverage percentage per new feature module (should be measurably higher than the legacy code — target above 70% for domain and ViewModel layers)
- Pivot Triggers — If after 3 months of migration, the build time has not decreased despite adding feature modules: the module dependency graph is too densely connected (for example, shared modules exposed via api instead of implementation, so a change in one module invalidates nearly every downstream module and defeats Gradle's incremental caching); run ./gradlew :app:dependencies to visualise the dependency tree and identify the over-connected modules
- Long-Term Evolution Plan — Months 1–3: core infrastructure (Hilt setup, Room database, Retrofit client, base ViewModel structure); Months 4–8: migrate the 5 highest-traffic features to feature modules; Months 9–12: complete migration of remaining features; Year 2: baseline profiling, build time under 3 minutes, full Compose migration
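The module-graph health check implied above can be automated. A minimal sketch — the adjacency map is hypothetical and would in practice be parsed from ./gradlew :app:dependencies output or a custom Gradle task — that counts inter-module edges (a simple complexity score) and catches accidental cycles before Gradle rejects them:

```kotlin
// Counts total inter-module dependencies — a rough complexity score
// that should stay flat or shrink as the migration proceeds.
fun interModuleEdgeCount(graph: Map<String, List<String>>): Int =
    graph.values.sumOf { it.size }

// Depth-first search for a back edge, i.e. a dependency cycle
// (Module A -> Module B -> Module A).
fun hasCycle(graph: Map<String, List<String>>): Boolean {
    val visiting = mutableSetOf<String>()
    val done = mutableSetOf<String>()
    fun dfs(node: String): Boolean {
        if (node in done) return false
        if (!visiting.add(node)) return true  // back edge -> cycle
        val cyclic = graph[node].orEmpty().any { dfs(it) }
        visiting.remove(node)
        done.add(node)
        return cyclic
    }
    return graph.keys.any { dfs(it) }
}
```

Running this in CI against the exported graph turns the "dependency graph complexity score" metric from a manual review item into an automated gate.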
3. The Answer
Explicit Assumptions:
- The app's primary features: home feed, product search, product detail, cart, checkout, user profile (6 major feature areas, each currently implemented as 3–5 Activities)
- Minimum Android version: API 24 (Android 7.0) — Jetpack Compose supports API 21+; all planned libraries are compatible
- Gradle build system: Gradle 8.x with the Kotlin DSL for build scripts; currently uses Groovy DSL which will be migrated incrementally
- CI pipeline: GitHub Actions with an automated build and test on every PR
The Target Module Structure
The target architecture uses a hybrid modularisation pattern with 4 module types. :app module (the entry point): contains MainActivity, NavHost (the Compose navigation graph), and Hilt application-level setup; no business logic; no UI screens; this module compiles in under 30 seconds. :core:* modules (shared infrastructure): :core:network (Retrofit client, OkHttp interceptors, network models), :core:database (Room database, shared DAOs for cross-feature entities like User), :core:design (the Compose design system — typography, colour tokens, common components like AppButton, AppTextField, AppTopBar), :core:common (shared utility functions, extension functions, coroutine dispatchers, Result<T> sealed class). :feature:* modules (one per product feature): each feature module contains its own data/ (feature-specific repositories and data sources), domain/ (feature-specific use cases), and ui/ (Compose screens and ViewModels); feature modules can depend on :core:* modules but must never depend on other feature modules (if two features share data, the shared data belongs in :core:database). :domain:* modules (optional shared domain): if multiple feature modules share domain logic (e.g., GetCurrentUserUseCase is used by both feature:home and feature:profile), extract it to a :domain:user module.
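A sketch of how this module layout could be declared in Gradle. Both fragments are illustrative (module names follow the structure above; an actual project would split them across settings.gradle.kts and the feature module's build.gradle.kts):

```kotlin
// settings.gradle.kts — hypothetical module registration for the layout above
include(
    ":app",
    ":core:network", ":core:database", ":core:design", ":core:common",
    ":feature:home", ":feature:search", ":feature:profile",
)

// feature/search/build.gradle.kts — dependency sketch. Using `implementation`
// rather than `api` keeps :core internals off the feature's exported compile
// classpath, so a :core change invalidates fewer downstream modules.
dependencies {
    implementation(project(":core:network"))
    implementation(project(":core:database"))
    implementation(project(":core:design"))
    implementation(project(":core:common"))
    // never: implementation(project(":feature:home")) — feature-to-feature
    // dependencies are forbidden by this architecture.
}
```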
The Target Architecture Within Each Feature Module
Using the :feature:search module as the canonical example. Data layer: SearchRemoteDataSource (Retrofit service calls → returns NetworkSearchResult), SearchLocalDataSource (Room DAO for cached search results), SearchRepository (the single source of truth — fetches from remote, writes to local, emits Flow&lt;List&lt;SearchResult&gt;&gt; from the local database; the ViewModel always reads from Room, never directly from the network). Domain layer: SearchProductsUseCase(searchRepository: SearchRepository) — accepts a SearchQuery value class and returns Flow&lt;Result&lt;List&lt;Product&gt;&gt;&gt;; this use case is pure Kotlin with no Android imports — it can be tested with plain JUnit, without Robolectric or an emulator. Presentation layer: SearchViewModel(searchProductsUseCase: SearchProductsUseCase) — holds private val _uiState = MutableStateFlow&lt;SearchUiState&gt;(SearchUiState.Empty); exposes val uiState: StateFlow&lt;SearchUiState&gt; = _uiState.asStateFlow(); handles UI events via fun onSearchQueryChanged(query: String) and fun onSearchSubmitted(); all business logic in the ViewModel is delegated to the use case. UI layer: SearchScreen(viewModel: SearchViewModel = hiltViewModel()) — a Composable that calls val uiState by viewModel.uiState.collectAsStateWithLifecycle(); renders based on the sealed SearchUiState; emits events back to the ViewModel via lambdas passed to child Composables.
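The presentation-layer contract above can be sketched as pure Kotlin. The SearchUiState shape and the reduce helper are illustrative (a hypothetical synchronous stand-in for the StateFlow pipeline — the real ViewModel would collect the use case's Flow into a MutableStateFlow):

```kotlin
// Hypothetical UI state for the search screen.
sealed interface SearchUiState {
    object Empty : SearchUiState
    object Loading : SearchUiState
    data class Success(val products: List<String>) : SearchUiState
    data class Error(val message: String) : SearchUiState
}

// Maps a use-case result to UI state — the kind of pure function that is
// trivially unit-testable without Robolectric or an emulator.
fun reduce(result: Result<List<String>>): SearchUiState =
    result.fold(
        onSuccess = { products ->
            if (products.isEmpty()) SearchUiState.Empty
            else SearchUiState.Success(products)
        },
        onFailure = { SearchUiState.Error(it.message ?: "Unknown error") },
    )
```

Keeping the state a sealed hierarchy means the Composable's `when` over SearchUiState is exhaustive — the compiler flags any unhandled state.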
The Strangler Fig Migration: How to Migrate Without Stopping Releases
The migration runs in parallel with feature development using a "new features in new architecture, migrate old features opportunistically" strategy. Phase 1 (Month 1–2): set up the new architecture without removing any old code. Add Hilt to the project (this is purely additive — existing Activities and their dependencies are unaffected). Add the :core:network module with the new Retrofit setup. Add the :core:database module with Room. Add the :core:design module with the first Compose components. The app still builds and runs identically to before. Phase 2 (Month 3–4): build the first new feature in the new architecture. The next feature request (e.g., a redesigned search experience) is built entirely in the new :feature:search module. The old SearchActivity continues to exist in the :app module. The new SearchScreen Composable is reached via a new navigation route in the NavHost. Feature flag: a FeatureFlag.NEW_SEARCH_ENABLED boolean determines whether clicking "Search" navigates to the old SearchActivity or the new NavHost route. Once the new search is production-validated, the feature flag is flipped to 100%, the old SearchActivity is deleted, and the search-related code is removed from :app. Phase 3 (Months 5–12): repeat the feature flag pattern for each remaining feature area. Migration order is determined by development velocity (features currently under active development are migrated first) and isolation (features with the fewest dependencies on other features are easiest to migrate first).
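The Phase 2 feature-flag routing decision can be isolated into a pure, testable function. All names here are hypothetical; in the app this logic would sit wherever the "Search" click is handled:

```kotlin
// Hypothetical flag holder — in production this would be backed by a
// remote config service so the rollout percentage can change without a release.
data class FeatureFlags(val newSearchEnabled: Boolean)

sealed interface SearchDestination {
    object LegacySearchActivity : SearchDestination
    data class ComposeRoute(val route: String) : SearchDestination
}

// The single decision point: old SearchActivity or new NavHost route.
fun searchDestination(flags: FeatureFlags): SearchDestination =
    if (flags.newSearchEnabled) SearchDestination.ComposeRoute("search")
    else SearchDestination.LegacySearchActivity
```

Once the flag is at 100% and validated, deleting the legacy branch reduces this function to a single navigation call, at which point the flag itself is removed.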
Handling the SharedPreferences to Room Migration
The move from SharedPreferences to DataStore (for primitive key-value storage) and Room (for structured data) is a data migration challenge. The migration strategy: (1) Add a LegacyPreferencesMigration class that reads from SharedPreferences and writes to Room/DataStore on first launch (a one-time migration). (2) New code reads exclusively from Room/DataStore. (3) Old code continues to read from SharedPreferences until the old screen is migrated. (4) After the migration of the final screen that depends on the legacy preference key, delete the legacy key and its migration code. This is a zero-downtime migration — users on old app versions use SharedPreferences; users who update migrate transparently on first launch.
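The one-time migration in step (1) can be sketched framework-free. LegacyPrefs and NewStore are hypothetical interfaces standing in for SharedPreferences and DataStore/Room; a "migrated" marker key makes the migration idempotent:

```kotlin
// Hypothetical abstractions over the legacy and new storage.
interface LegacyPrefs { fun all(): Map<String, String> }

interface NewStore {
    fun put(key: String, value: String)
    fun contains(key: String): Boolean
}

const val MIGRATED_KEY = "legacy_prefs_migrated"

// Runs at most once: copies every legacy entry into the new store,
// then records the marker so subsequent launches are a no-op.
fun migrateIfNeeded(legacy: LegacyPrefs, store: NewStore) {
    if (store.contains(MIGRATED_KEY)) return  // already migrated
    legacy.all().forEach { (k, v) -> store.put(k, v) }
    store.put(MIGRATED_KEY, "true")
}
```

Writing the marker into the new store (not the legacy one) matters: if the legacy file is later deleted, the marker survives and the migration never reruns.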
Build Time Improvement Projection
Before migration: 14-minute monolithic build (every change triggers a full recompile). After migration (target state): :core:* modules compile in 45–90 seconds each and are cached by Gradle; :feature:* modules compile in 60–120 seconds each and are cached independently; a change in feature:search triggers only feature:search + app recompilation = approximately 2–3 minutes. Incremental build for the common case (a developer changes code in one feature module): under 90 seconds. Full build (first build or when :core changes): approximately 4–5 minutes. Target: full builds under 5 minutes, incremental builds under 2 minutes.
Early Warning Metrics:
- Lines of Java remaining in the codebase (monthly) — the metric that tracks Java-to-Kotlin migration progress; should decrease monotonically; a plateau indicates the team is not prioritising migration alongside feature work
- Module dependency graph complexity score — computed by counting the number of inter-module dependencies; a score that increases over time indicates that feature modules are beginning to depend on each other (violating the architecture) and must be refactored
- Domain layer unit test coverage — measured per feature module using the kover plugin; target above 70% for new feature modules; use cases and ViewModels are the primary test targets; a module below 50% coverage is a technical debt risk item for the next sprint
4. Interview Score: 9.5 / 10
Why this demonstrates senior-level maturity: The strangler fig migration pattern with feature flags — specifically the detail that the old SearchActivity and the new SearchScreen coexist with a runtime flag determining which is served, allowing the new architecture to be production-validated before the old code is deleted — is the production engineering judgment that distinguishes a senior engineer from one who would propose a risky big-bang rewrite. The :app module discipline (keeping it thin, containing only the NavHost and Hilt application setup) is the architectural constraint that prevents the new structure from gradually re-accumulating the same monolith problem. The one-directional data flow contract (Repository → Room → Flow → ViewModel → StateFlow → Compose) with the explicit rule that the ViewModel never reads directly from the network is the specific design decision that makes the architecture testable.
What differentiates it from mid-level thinking: A mid-level Android developer would propose "migrate to MVVM and use Jetpack Compose" without specifying the module structure, without addressing the strangler fig migration strategy for a team that must continue releasing weekly, without explaining the Hilt component hierarchy and scoping rules, and without providing the build time improvement projection with Gradle caching.
What would make it a 10/10: A 10/10 response would include the specific Gradle module dependency configuration in the build.gradle.kts for a feature module (showing the implementation/api distinction for core dependencies), a concrete SearchUiState sealed class definition with all the relevant states (Loading, Success, Empty, Error), and a complete Hilt module showing the dependency bindings for the Repository and its data sources.
Question 2: Performance — Diagnosing and Resolving Android App Performance Problems
Difficulty: Senior | Role: Android Developer | Level: Senior | Company Examples: Google, Instagram, TikTok, Netflix, LinkedIn
The Question
You are a Senior Android Developer at a social media company. The Android app has 5 million daily active users and has received a wave of user complaints and 1-star reviews in the past month: "the feed scrolls choppily," "the app takes 8 seconds to open," and "watching videos causes the phone to get hot and the battery drains fast." Android Vitals in the Google Play Console shows: app startup time p75 = 7.8 seconds (Google's "bad" threshold is 5 seconds for cold start), ANR rate = 0.48% (threshold for "bad" is 0.47%), jank rate on the main feed screen = 18% of frames rendered over 16ms. You have Android Studio Profiler, Perfetto, and Firebase Performance Monitoring available. Walk through your systematic investigation of each of the three performance problems — startup time, ANRs, and jank — the specific tools and metrics you would use, the most likely root causes in a social media app context, and the specific fixes.
1. What Is This Question Testing?
- App startup optimisation — knowing the three startup types: cold start (app process not in memory — the most expensive), warm start (process exists but Activity is destroyed), and hot start (process and Activity exist but are backgrounded); knowing that 7.8 seconds cold start indicates significant work happening before the first frame is displayed; knowing the tools: Android Studio App Startup profiler trace, Perfetto system trace with atrace categories, and the reportFullyDrawn() API; and knowing the common cold start culprits in a social media app: initialising too many libraries synchronously on the main thread during Application.onCreate(), loading large assets before the launch screen is dismissed
- ANR diagnosis — knowing that an ANR (Application Not Responding) occurs when the main thread is blocked for more than 5 seconds while dispatching an input event, or 10 seconds while processing a foreground broadcast; knowing that the Play Console provides ANR traces showing the main thread's stack at the moment of the ANR; knowing the common causes: synchronous network calls on the main thread, database queries on the main thread, the main thread waiting on a lock held by another thread; and knowing the solution: use coroutines with Dispatchers.IO for any blocking operation
- Jank and rendering performance — understanding the 60fps target (16ms per frame) and the rendering pipeline (CPU measure/layout/draw → GPU compositing → display); knowing that jank (frames over 16ms) has two broad root causes: CPU-bound work in the measure/layout/draw phase (too many nested ViewGroups, expensive onDraw() calls) and GPU overdraw (drawing the same pixel multiple times); knowing the tools: Systrace/Perfetto for frame timing, the GPU Overdraw visualiser in Developer Options, Android Studio Layout Inspector for view hierarchy depth, and the Window.OnFrameMetricsAvailableListener API (the mechanism behind Jetpack JankStats) for production jank monitoring
- Memory and thermal issues — knowing that video playback causing the phone to heat up indicates CPU/GPU over-utilisation; the ExoPlayer configuration must use hardware-accelerated decoding (MediaCodec), not software decoding; excessive thread creation for video loading is a common cause; the Memory Profiler in Android Studio reveals memory leaks (objects that should be garbage collected but remain in memory because a static reference holds them)
- Firebase Performance Monitoring for production data — knowing that Android Studio Profiler reveals what happens on a developer's device; Firebase Performance Monitoring reveals what happens across the production fleet; custom traces (val trace = FirebasePerformance.getInstance().newTrace("feed_load_time"); trace.start(); ...; trace.stop()) allow measuring specific operations in production; knowing how to use this data to prioritise fixes (a slow operation that affects 20% of users at p95 is higher priority than one that affects 0.1% of users at p99)
- The StrictMode developer tool — knowing that enabling StrictMode in debug builds detects disk reads/writes and network calls on the main thread at development time, before they reach production; a team that ships an ANR-causing main-thread database read does not have StrictMode enabled
2. Framework: Android Performance Investigation Model (APIM)
- Assumption Documentation — Before profiling, gather baseline data: which device tier generates the most ANRs (low-end devices with 2GB RAM are the most common ANR generators for apps that assume 6GB+), which network conditions correlate with startup time (a 7.8-second cold start may be driven by a synchronous network call on a slow network), and what percentage of the jank comes from the video feed vs. the text feed (isolating the component narrows the investigation)
- Constraint Analysis — 5 million DAU means the investigation must use production data (Firebase Performance Monitoring) to reproduce the exact conditions that users experience; a developer's high-end Pixel 8 Pro will not reproduce the ANRs that occur on a 2GB RAM Realme C11
- Tradeoff Evaluation — Fix the ANR first (it is at the "bad" threshold in Android Vitals — Google's Play Store ranking algorithm penalises apps above the ANR threshold), then the startup time (7.8 seconds is above the "bad" threshold and affects first-run user retention), then jank (important for user satisfaction but not currently triggering a Play Store ranking penalty)
- Hidden Cost Identification — Jank fixes that involve changing view hierarchy depth or custom onDraw() methods risk introducing new layout bugs; performance fixes in the rendering pipeline must be regression-tested with UI tests (Espresso or Compose test rules) before release; a jank fix that introduces a layout regression is a net negative
- Risk Signals / Early Warning Metrics — Android Vitals ANR rate trend (alert if ANR rate exceeds 0.47% in any release — this is the Play Store "bad" threshold that affects app ranking), app startup time p75 trend (Google Play measures this automatically; alert if p75 exceeds 5 seconds for cold start), frame rendering time (measured in production via FrameMetricsAggregator or Firebase Performance custom traces — alert if more than 15% of frames take over 16ms in the main feed)
- Pivot Triggers — If the ANR traces from the Play Console all point to the same thread (a background thread holding a lock on a shared resource that the main thread is waiting for): the ANR root cause is lock contention, not a long-running main-thread operation; the fix is lock-free data structures or replacing the shared mutable state with a StateFlow
- Long-Term Evolution Plan — Fix the immediate issues (4–6 weeks); establish a performance regression budget (no release increases any of the 3 key metrics by more than 5%); implement baseline profiles (a Jetpack BaselineProfile to pre-compile the app's critical code paths); add macrobenchmark tests to the CI pipeline to catch performance regressions before release
3. The Answer
Explicit Assumptions:
- The app uses Jetpack Compose for the feed UI and a custom RecyclerView adapter for the video feed (in a transition state between the legacy and new UI)
- ExoPlayer (Media3) is used for video playback
- The app initialises 12 third-party SDKs in Application.onCreate()
- The CI pipeline does not include any performance regression tests
Problem 1: Cold Start Time (7.8 seconds) — Investigation and Fix
Investigation: open Android Studio, connect a mid-range test device (not the developer's flagship phone), and record a CPU trace from app startup (enable "Start this recording on startup" in the run configuration's profiling options). The timeline shows all tasks executed before the first frame is displayed. The trace reveals: Application.onCreate() takes 4.1 seconds — the majority of the 7.8-second startup. Within Application.onCreate(), 12 SDK initialisation calls are stacked sequentially: Firebase (0.2s), Amplitude (0.4s), Crashlytics (0.1s), and 9 others including a proprietary video SDK initialisation that takes 1.4 seconds. Root cause: all 12 SDKs are initialised synchronously on the main thread in Application.onCreate(). The video SDK performs a disk read during initialisation (loading a configuration file), which is also visible in the Perfetto trace as a long I/O wait on the main thread. Fix: move non-critical SDK initialisations off the main thread and defer them until they are first needed. Classify the 12 SDKs into 3 groups: critical at startup (Firebase Remote Config — needed before the first screen renders), critical but parallelisable (Amplitude, Crashlytics — must initialise before the first user action but can run on a background thread in parallel with the main thread rendering the launch screen), and deferrable (the video SDK — can be initialised when the user first scrolls to a video, not at app start). Implement with androidx.startup.Initializer for the critical SDKs and an application-scoped CoroutineScope(SupervisorJob() + Dispatchers.Default) launched from Application.onCreate() for the non-critical parallel initialisations (Application has no lifecycleScope of its own). Expected startup time after fix: 4.1s initialisation → 0.8s for the critical-only sequential path + background parallel initialisations complete before the user reaches the feed = p75 cold start time approximately 2.1 seconds.
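The "deferrable" group can be modelled framework-free with Kotlin's lazy delegation — the SDK initialises on first use, never on the startup path. DeferredSdk and the string client are hypothetical stand-ins; on Android the critical group would instead be declared as androidx.startup Initializers:

```kotlin
// Sketch: defer an SDK's initialisation until first use. The () -> String
// factory is a hypothetical stand-in for an expensive SDK init call.
class DeferredSdk(private val initialise: () -> String) {
    var initialised = false
        private set

    private val client: String by lazy {
        initialised = true
        initialise()       // runs exactly once, on first access
    }

    // The first call pays the init cost; app startup never does.
    fun use(): String = client
}
```

`lazy` is thread-safe by default (LazyThreadSafetyMode.SYNCHRONIZED), so the first video scroll from any thread triggers exactly one initialisation.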
Problem 2: ANR Rate (0.48%) — Investigation and Fix
Investigation: download the ANR traces from the Google Play Console (Android Vitals → ANRs → download the top ANR cluster). The trace shows the main thread's call stack at the moment of the ANR. For a social media app, the most common ANR traces show one of: the main thread executing SharedPreferences.commit() (a synchronous disk write on the main thread), a Retrofit execute() call (synchronous network on the main thread), or a blocking Room query executed on the main thread (possible when allowMainThreadQueries() is enabled). The trace from Play Console reveals: SharedPreferences.commit() → UserPreferencesManager.saveUserSettings() → ProfileViewModel.onSettingChanged() → ProfileFragment.onSwitchToggled(). This is a synchronous disk write triggered by a UI interaction — a classic ANR cause. Root cause: SharedPreferences.commit() blocks the calling thread until the write completes; apply() writes asynchronously; the code used commit() incorrectly. Immediate fix (1 day): replace SharedPreferences.commit() with SharedPreferences.apply() across all usages; enable StrictMode.setThreadPolicy(StrictMode.ThreadPolicy.Builder().detectDiskWrites().penaltyLog().build()) in debug builds to detect any remaining disk writes on the main thread. Medium-term fix (1 sprint): migrate UserPreferences to DataStore&lt;Preferences&gt; (Jetpack DataStore is built on Flow and coroutines — all writes are inherently asynchronous and off the main thread) with coroutine-based writes: suspend fun saveUserSettings(settings: UserSettings) { dataStore.edit { it[USER_SETTINGS_KEY] = settings.toJson() } }. Expected ANR rate after fix: 0.08–0.12% (the SharedPreferences.commit() cluster is typically responsible for 70–80% of ANRs in apps that use it for settings persistence).
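The principle behind the apply() fix — UI interactions enqueue the write, a background thread performs the disk I/O — can be sketched with a stdlib single-thread executor. AsyncSettingsWriter and writeToDisk are hypothetical names; this is an illustration of the pattern, not the SharedPreferences implementation itself:

```kotlin
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Sketch of the apply()-style pattern: writes are queued to one background
// thread so a UI interaction never blocks on disk I/O. writeToDisk is a
// hypothetical stand-in for the real persistence call.
class AsyncSettingsWriter(private val writeToDisk: (Map<String, String>) -> Unit) {
    private val executor = Executors.newSingleThreadExecutor()

    // Returns immediately; the disk write happens off the calling thread.
    fun save(settings: Map<String, String>) {
        executor.execute { writeToDisk(settings) }
    }

    // Drains pending writes — the analogue of apply()'s flush on shutdown.
    fun shutdownAndFlush() {
        executor.shutdown()
        executor.awaitTermination(5, TimeUnit.SECONDS)
    }
}
```

The single-thread executor also serialises writes, so later saves cannot overtake earlier ones — the same ordering guarantee apply() provides.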
Problem 3: Jank (18% frames over 16ms) — Investigation and Fix
Investigation: capture a Perfetto system trace during a manual scroll through the feed. The flame chart shows the CPU time spent per frame. For a Compose + RecyclerView hybrid feed, jank has two common sources. Source A — Recomposition storms in Compose: the feed embeds Compose content inside RecyclerView ViewHolders, so recompositions are triggered every time the RecyclerView binds a ViewHolder, even for items that haven't changed. Use the Layout Inspector's recomposition counts (available since Android Studio Flamingo) to identify which Composables are recomposing on every frame. The FeedItemCard Composable is recomposing on every scroll frame even when its feedItem argument has not changed. Root cause: FeedItem is a data class, but the Compose compiler infers it as unstable because one of its fields is a List&lt;String&gt; (an interface type Compose cannot prove immutable), so recomposition skipping is disabled for FeedItemCard. Fix: annotate FeedItem with @Immutable (a promise to Compose that the object never changes after construction, which restores skipping) or @Stable, or replace the List&lt;String&gt; with an ImmutableList from kotlinx.collections.immutable. Source B — GPU overdraw in the video thumbnail loading: the RecyclerView displays video thumbnails loaded via Coil (the Kotlin image loading library). The GPU Overdraw visualiser (Developer Options → Debug GPU overdraw) shows 3-layer overdraw on every feed item (the background of the RecyclerView, the background of the ViewHolder, and the image background — all drawing the same pixels). Fix: set the RecyclerView's background to @null in XML (it inherits the window background; there is no need for a separate background); remove the explicit background from the ViewHolder's root view. Expected jank rate after both fixes: 18% → approximately 4–6% of frames over 16ms (below the threshold for user-perceived jank on mid-range devices).
Video Playback Thermal and Battery Fix
The phone heating and battery drain during video playback suggests software video decoding. Verify: attach an ExoPlayer AnalyticsListener and log the decoder name reported by onVideoDecoderInitialized — software decoders are published under the OMX.google. and c2.android. prefixes, while hardware decoders carry vendor prefixes (e.g. c2.qti., OMX.qcom.); adb shell dumpsys media.codec during playback shows the active codec instances. Root cause: ExoPlayer is falling back to a software decoder — commonly because an extension software renderer (e.g. the FFmpeg extension) is being preferred, or because the device's hardware decoder does not support the stream's codec/profile combination (high-bitrate VP9 or AV1 on older SoCs is a frequent culprit). Fix: configure DefaultRenderersFactory(context).setExtensionRendererMode(DefaultRenderersFactory.EXTENSION_RENDERER_MODE_OFF) so the hardware MediaCodec decoders are always preferred, and ensure the adaptive streaming manifest offers a codec/profile the device can decode in hardware. Additionally: limit background video preloading to 1 video ahead (the current implementation preloads 3 videos ahead, causing 3 decoders to initialise simultaneously — the thermal spike occurs during this multi-decoder initialisation period).
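The decoder-name check can be expressed as a small helper fed from the AnalyticsListener callback. The prefix convention (OMX.google. / c2.android. for platform software decoders) is standard on Android, but treat the helper itself as an illustrative sketch:

```kotlin
// Platform software decoders on Android are published under these prefixes;
// hardware decoders carry vendor prefixes such as "c2.qti." or "OMX.qcom.".
val SOFTWARE_DECODER_PREFIXES = listOf("OMX.google.", "c2.android.")

// Feed this from AnalyticsListener.onVideoDecoderInitialized(decoderName, ...)
// and alarm (or log to analytics) when playback falls back to software.
fun isSoftwareDecoder(decoderName: String): Boolean =
    SOFTWARE_DECODER_PREFIXES.any { decoderName.startsWith(it) }
```

Reporting this boolean to Firebase as a custom attribute turns "are users silently hitting software decoding?" into a queryable production metric rather than a lab-only check.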
Early Warning Metrics:
- FrameMetricsAggregator custom Firebase trace for the main feed — reports the percentage of frames over 16ms per build version; alert if any build version shows more than 10% of frames over 16ms in production
- ANR cluster velocity in Android Vitals — monitor the rate of new ANR signatures appearing per release; a new ANR signature in a release means a new main-thread blocking operation was introduced and should trigger an immediate investigation before the next release
- App startup time p75 in Firebase Performance Monitoring, by device tier — segment cold start time by device tier (flagship, mid-range, low-end); ensure p75 cold start is below 3 seconds for flagship and below 5 seconds for mid-range; a new SDK initialisation that slips through baseline profiling will appear as a step-change increase in this metric
4. Interview Score: 9.5 / 10
Why this demonstrates senior-level maturity: The specific ANR trace diagnosis path — identifying SharedPreferences.commit() as the root cause from the Play Console call stack, distinguishing it from the secondary fix (DataStore migration), and providing the immediate 1-day fix alongside the medium-term architectural fix — is the triage discipline that prioritises business impact over engineering elegance. The Compose recomposition jank diagnosis using the @Stable annotation as the fix (rather than a generic "optimise the list") shows deep Compose rendering knowledge. The ExoPlayer hardware decoding diagnosis using adb shell dumpsys media.player is the specific debugging command that a developer who has actually debugged video playback issues on Android knows, rather than a generic "check ExoPlayer configuration" answer.
What differentiates it from mid-level thinking: A mid-level Android developer would suggest "profile with Android Studio Profiler" for all three problems without knowing which specific profiler view to use for each problem type, would not know about the StrictMode main-thread detection tool, would not identify the @Stable/@Immutable Compose annotation as the recomposition jank root cause, and would not know how to diagnose software vs. hardware video decoding via adb shell dumpsys.
What would make it a 10/10: A 10/10 response would include the specific Perfetto trace configuration commands (adb shell perfetto -t 10s -b 32mb -o /data/misc/perfetto-traces/trace.perfetto-trace), a concrete App Startup library Initializer implementation for the parallel SDK initialisation, and a BaselineProfile configuration showing how to pre-compile the critical code paths for the feed screen.
Question 3: Testing — Designing a Comprehensive Android Testing Strategy
Difficulty: Senior | Role: Android Developer | Level: Senior | Company Examples: Google, Uber, Airbnb, Dropbox, Slack
The Question
You are a Senior Android Developer at a fintech company. The Android app handles payments, account management, and financial reporting. The current test suite has 23% unit test coverage, no integration tests, and 12 UI tests that run on a physical device farm and take 45 minutes to complete. The development team of 10 engineers has a "test debt" mentality — tests are written after bugs are reported, not before or during feature development. There have been 3 production incidents in the past quarter caused by Android-side issues: a database migration that corrupted data for users upgrading from version 4.x to 5.x, a ViewModel that leaked a LiveData observer and caused crashes for users who navigated back and forth rapidly, and a payment confirmation screen that submitted the payment twice when the submit button was tapped during a slow network response. Walk through the testing strategy you would implement, how you would address each of the three historical incidents with specific test types, and how you would change the team's culture toward test-first development.
1. What Is This Question Testing?
- The Android testing pyramid — understanding the three-level Android testing pyramid: unit tests (fast, run on the JVM, test individual classes in isolation — ViewModels, use cases, repositories with mocked dependencies), integration tests (run on the device/emulator, test interactions between layers — Room database with migration tests, DataStore, coroutine flow integration), and end-to-end UI tests (run on device/emulator, test complete user flows — Espresso or Compose UI test); knowing that the current 23% unit coverage with 12 E2E tests is an inverted pyramid (too many expensive E2E tests relative to cheap unit tests)
- Room database migration testing — knowing that androidx.room.testing.MigrationTestHelper is the specific testing class for verifying Room migrations; knowing that a migration test creates a database at the old version, inserts test data, runs the migration, and verifies that the migrated schema and data are correct; this is an integration test that runs on a device/emulator; knowing the specific pattern: val db = migrationTestHelper.createDatabase(TEST_DB, 4) → insert old schema data → db = migrationTestHelper.runMigrationsAndValidate(TEST_DB, 5, true, MIGRATION_4_5) → verify data integrity
- LiveData and lifecycle leak testing — knowing that LiveData observer leaks occur when a subscriber is attached to a non-lifecycle-aware owner (a static reference, an observeForever call without a corresponding removeObserver, or a ViewModel outliving the Fragment); knowing that TestLifecycleOwner (from androidx.lifecycle:lifecycle-testing) simulates the lifecycle without a real Activity; knowing the specific pattern for testing ViewModel cleanup: assert that after viewModel.onCleared() is called (which simulates the ViewModel being destroyed), no active observers remain on any LiveData
- Idempotency testing for payment flows — knowing that double-submission bugs in payment flows are a class of race condition bug caused by the user tapping a button while the coroutine handling the first tap is still running; the test must simulate concurrent taps on the submit button and assert that only one payment is submitted regardless of how many taps occur; this requires both a ViewModel unit test (testing the idempotency logic with a fake repository) and an E2E test (testing the UI prevents the second tap while the first is in flight)
- Coroutine testing — knowing kotlinx-coroutines-test, specifically StandardTestDispatcher and UnconfinedTestDispatcher (the replacements for the deprecated TestCoroutineDispatcher); knowing the difference: StandardTestDispatcher does not automatically execute coroutines (you must explicitly call advanceUntilIdle() to advance the virtual time), which allows testing race conditions and intermediate states; UnconfinedTestDispatcher executes coroutines immediately, which is appropriate for simple tests that do not need to observe intermediate states
- Shifting the team culture — understanding that test-first development (TDD) is a culture and workflow change, not just a tooling change; knowing the practical mechanisms: making test writing part of the Definition of Done for every user story, adding test coverage to PR review checklists, establishing a "test pairing" practice where QA and the developer write tests together for new features before implementation
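The idempotency requirement in the payment-flow bullet above reduces to a guard flag that is checked and set synchronously, before any asynchronous work starts. A minimal, framework-free sketch of that invariant (class and member names here are illustrative, not from the codebase):

```kotlin
// Minimal model of the submit-once guard: because the in-flight flag is set
// synchronously, a second tap arriving before the first submission completes
// is rejected immediately.
class PaymentSubmitter {
    private var inFlight = false
    var submitCount = 0
        private set

    // Returns true if this tap actually started a submission.
    fun onSubmitTapped(): Boolean {
        if (inFlight) return false // Guard: a submission is already running
        inFlight = true
        submitCount++              // Stand-in for the real network call
        return true
    }

    // Called when the (asynchronous) submission finishes.
    fun onSubmissionFinished() {
        inFlight = false
    }
}
```

Two rapid taps produce one submission; only after onSubmissionFinished() runs is a later tap accepted again. The real ViewModel version of this guard appears in the answer below, where the flag is the Submitting UI state.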
2. Framework: Android Testing Strategy Design Model (ATSDM)
- Assumption Documentation — Categorise the existing 12 UI tests: are they testing happy paths or edge cases? How many of the existing tests cover the payment flow? Are the tests written in Espresso or Compose test APIs? Understanding the quality of the existing test suite is as important as understanding its size
- Constraint Analysis — 45-minute UI test suite on the device farm is too slow for PR-level feedback; the strategy must reduce the E2E suite to a smoke test set (10–15 tests, under 10 minutes) and cover the majority of cases with faster unit and integration tests; the device farm cost must also be considered — running 100+ E2E tests on a device farm at scale costs real money per minute
- Tradeoff Evaluation — Robolectric (run Android code on JVM — fast, no emulator required, some Android APIs are imperfect simulations) vs. real device/emulator testing (accurate, slow, expensive); for unit tests of ViewModels and use cases, Robolectric is appropriate; for Room migration tests, only a real emulator produces correct results; for payment confirmation E2E tests, a real emulator is required to correctly test the submission throttling UI behaviour
- Hidden Cost Identification — The biggest hidden cost in Android testing is test maintenance; a poorly designed test that is tightly coupled to the UI structure (referencing view IDs, not semantic labels) breaks on every UI redesign; testing the ViewModel's state rather than the UI elements' properties produces more resilient tests
- Risk Signals / Early Warning Metrics — Code coverage per layer (domain layer target above 85%, ViewModel layer above 75%, repository layer above 70%), flaky test rate in CI (a test that fails intermittently in CI but passes locally is usually a concurrency or timing issue — flaky tests degrade the team's confidence in the CI suite), mutation testing score (a coverage metric that measures whether the test suite actually catches bugs, not just executes code — a mutation score above 60% indicates genuinely useful tests)
- Pivot Triggers — If the Room migration test suite shows migration failures for more than 20% of historical migration paths: the database schema has been evolving without migration tests and there is significant migration debt; schedule a dedicated sprint to write migration tests for all existing migrations before adding new schema changes
- Long-Term Evolution Plan — Sprint 1–2: write the 3 incident-preventing tests; Sprint 3–6: establish testing standards and raise unit test coverage to 60% on new code; Sprint 7–12: raise coverage to 75%, migrate legacy E2E tests to unit tests where appropriate, add macrobenchmark tests for the payment flow performance
3. The Answer
Explicit Assumptions:
- The app uses Room for local storage, Kotlin Coroutines and Flow for async operations, Hilt for dependency injection, and Jetpack Compose for the payment confirmation screen
- The payment confirmation ViewModel uses suspend fun submitPayment() in a coroutine launched by a Button's onClick handler
- The database migration that corrupted data was MIGRATION_4_5: it dropped and recreated a transactions table but did not migrate the data from the old schema
Test 1: The Database Migration Corruption — Room Migration Test
The incident: the MIGRATION_4_5 migration dropped the transactions table without migrating existing rows, corrupting data for all users upgrading from version 4.x. The test that would have caught this: a MigrationTest integration test using MigrationTestHelper. Implementation: in the androidTest/ source set (because this requires a real database on an emulator), create TransactionsDatabaseMigrationTest. The test opens a v4 database, inserts 3 test transactions using the old schema's raw SQL, runs the migration to v5, and queries the migrated database to confirm all 3 transactions are present with correct data. Specifically:
@Test
fun migration4To5_preservesTransactionData() {
val db = migrationTestHelper.createDatabase(TEST_DB, 4).apply {
execSQL("INSERT INTO transactions VALUES (1, 'PAYMENT', 100.00, '2024-01-15')")
execSQL("INSERT INTO transactions VALUES (2, 'PAYMENT', 250.00, '2024-01-16')")
close()
}
val migratedDb = migrationTestHelper.runMigrationsAndValidate(
TEST_DB, 5, true, MIGRATION_4_5
)
val cursor = migratedDb.query("SELECT COUNT(*) FROM transactions")
cursor.moveToFirst()
assertThat(cursor.getInt(0)).isEqualTo(2) // Would have been 0 before the fix
}
This test would have failed before the fix (revealing the data loss) and passes after the correct migration is implemented. Add this test pattern to the Definition of Done for every future schema change: any Room migration must have a corresponding MigrationTest before the feature is merged.
Test 2: The LiveData Observer Leak — ViewModel Lifecycle Test
The incident: the ViewModel attached a LiveData observer via observeForever() without removing it in onCleared(), causing retained references to stale observers and eventually crashing with a NullPointerException when the observer's target was garbage collected. The test that would have caught this:
@Test
fun viewModel_onCleared_removesAllObservers() = runTest {
val fakeRepository = FakeAccountRepository() // Hoisted so the test can drive emissions below
val viewModel = AccountViewModel(fakeRepository)
val testOwner = TestLifecycleOwner()
var observedValues = 0
viewModel.accountState.observe(testOwner) { observedValues++ }
// Simulate the ViewModel being destroyed
viewModel.onCleared() // Calls removeObserver() for all registered observers
// After onCleared, no observers should receive new values
fakeRepository.emitNewAccountData(testAccountData)
advanceUntilIdle()
assertThat(observedValues).isEqualTo(0) // New values emitted after onCleared must not reach the observer
}
For new code using StateFlow (the correct replacement for observeForever): this entire class of bugs disappears because StateFlow collection via collectAsStateWithLifecycle() in Compose automatically cancels the collection when the Composable leaves the composition, with no manual cleanup required. The migration from LiveData to StateFlow in ViewModels is therefore both a correctness improvement and a testability improvement.
Test 3: Double Payment Submission — Idempotency Unit Test and UI Test
The incident: rapidly tapping the Submit Payment button twice submitted the payment twice, because the ViewModel launched a new coroutine on every onSubmitClicked() call without checking if a submission was already in flight. The ViewModel unit test (JVM, fast):
@Test
fun submitPayment_whileAlreadySubmitting_doesNotSubmitTwice() = runTest {
val fakePaymentRepo = FakePaymentRepository(delayMs = 2000) // Simulates slow network
val viewModel = PaymentViewModel(fakePaymentRepo)
// Simulate two rapid taps
viewModel.onSubmitClicked()
viewModel.onSubmitClicked() // This second call must be ignored
advanceUntilIdle() // Wait for all coroutines to complete
assertThat(fakePaymentRepo.submitCallCount).isEqualTo(1) // Only one submission
}
The fix in the ViewModel:
fun onSubmitClicked() {
if (_uiState.value is PaymentUiState.Submitting) return // Guard: ignore if already submitting
_uiState.value = PaymentUiState.Submitting // Set synchronously, before launching, so a second tap is rejected even if the coroutine has not run yet
viewModelScope.launch {
val result = paymentRepository.submitPayment(currentPayment)
_uiState.value = when (result) {
is Result.Success -> PaymentUiState.Success(result.data)
is Result.Error -> PaymentUiState.Error(result.exception.message)
}
}
}
The Compose UI test (confirms the button is disabled during submission):
@Test
fun submitButton_whileSubmitting_isDisabledAndNotClickable() {
composeTestRule.setContent {
PaymentConfirmationScreen(
uiState = PaymentUiState.Submitting,
onSubmitClicked = { /* Should not be called */ }
)
}
composeTestRule.onNodeWithText("Submit Payment").assertIsNotEnabled()
}
Shifting the Team Culture: Three Mechanisms
(1) Definition of Done includes test coverage: every user story's DoD requires: unit tests for all new use cases and ViewModels with coverage above 70%, an integration test for any new Room migration, and a UI test for any new user-facing interaction on a payment-critical screen. The PR review checklist includes a "test coverage" checkbox that the reviewer must check before approving. (2) Test pairing sessions: for the first 6 weeks, every payment-related feature is developed with a pairing session where the developer and QA engineer write the ViewModel unit tests and the critical UI tests before writing the implementation code (test-first for new features, test-alongside for bug fixes). (3) Make the test suite fast: the 45-minute E2E suite is a deterrent to running tests. Split the suite into 3 tiers in the CI pipeline: Tier 1 (unit tests, under 3 minutes) runs on every PR commit; Tier 2 (integration tests including Room migration tests, under 8 minutes) runs on every PR merge to main; Tier 3 (full E2E smoke test, under 15 minutes with emulator parallelisation) runs nightly and before every release. A developer who gets test feedback in 3 minutes will run tests continuously; one who waits 45 minutes will not.
Early Warning Metrics:
- New code unit test coverage per PR (measured by the kover Gradle plugin) — the PR must not decrease the coverage below the project minimum; a coverage report is automatically posted to the PR by GitHub Actions and the PR is blocked if coverage falls below the project threshold
- Database migration test execution in the release pipeline — every release build must pass all MigrationTest classes; a release blocked by a failing migration test before production protects all users from the corruption incident
- Payment flow E2E test suite flakiness rate — the payment confirmation tests are safety-critical; a test that fails intermittently is as dangerous as no test (the team learns to ignore intermittent failures); any payment test that fails more than 3 times in 30 CI runs without a corresponding code change is escalated for immediate investigation
4. Interview Score: 9.5 / 10
Why this demonstrates senior-level maturity: The specific MigrationTestHelper implementation with the exact SQL insert and row count assertion directly addresses the incident's root cause (data loss during migration) with a test that would have failed before the fix — this is test-driven incident prevention, not generic test coverage advice. The double-submission fix using the if (_uiState.value is PaymentUiState.Submitting) return guard in the ViewModel — combined with the matching Compose UI test that confirms the button is disabled during submission — shows that the same bug must be prevented at both the business logic layer and the UI layer. The 3-tier CI pipeline design (3 minutes for unit tests, 8 minutes for integration, 15 minutes for E2E) addresses the cultural barrier that makes developers avoid running tests: making the fastest feedback loop fast enough to use continuously.
What differentiates it from mid-level thinking: A mid-level Android developer would propose "increase unit test coverage to 80%" without specifying which test type addresses which historical incident, without knowing about MigrationTestHelper, without knowing about TestLifecycleOwner, and without providing the specific ViewModel idempotency guard pattern. They would not know about StandardTestDispatcher.advanceUntilIdle() for testing coroutine timing or the kover plugin for Gradle-integrated coverage reporting.
What would make it a 10/10: A 10/10 response would include the complete FakePaymentRepository implementation for the double-submission test (showing the delayMs parameter and the submitCallCount assertion field), a concrete CI YAML configuration for the 3-tier test pipeline with emulator parallelisation for the E2E tier, and a mutation testing configuration using a JVM mutation testing tool such as Pitest to measure whether the new tests genuinely catch bugs.
Question 4: Jetpack Compose — Advanced UI Patterns and State Management
Difficulty: Senior | Role: Android Developer | Level: Senior | Company Examples: Google, Twitter, Airbnb, Lyft, Cash App
The Question
You are a Senior Android Developer building a real-time collaborative note-taking app in Jetpack Compose. The app has 3 complex UI requirements: (1) A rich text editor that must handle concurrent edits from multiple users in real time — when another user edits the same note, the current user's cursor position must be preserved and the diff must be applied without flickering; (2) An infinite scroll list of notes that loads the next page when the user scrolls near the bottom, handles loading states and errors inline, and must restore the exact scroll position (including progress within a half-loaded page) after process death and recreation; (3) A custom drawing canvas where users can annotate notes with freehand sketches — the canvas must feel responsive with no perceptible latency between the user's touch input and the drawn line, even on mid-range devices. Walk through the Compose state management approach, the specific APIs you would use, and the performance considerations for each of the three requirements.
1. What Is This Question Testing?
- Compose state management depth — understanding the distinction between remember, rememberSaveable, mutableStateOf, derivedStateOf, and produceState; knowing when to hoist state to the ViewModel vs. when to keep state local to a Composable; knowing the difference between State<T> (Compose's observable state, triggers recomposition) and MutableState<T> (read-write state); and knowing that the ViewModel + StateFlow pattern is the correct approach for state that must survive configuration changes
- The Compose rendering pipeline and recomposition — understanding that every state change triggers recomposition of the Composable tree that reads that state; knowing that derivedStateOf creates a derived state that only triggers recomposition when its derived value changes (not when its inputs change if the derived value is unchanged); knowing that LazyColumn's rememberLazyListState() is the mechanism for saving and restoring scroll position, and that LazyListState.firstVisibleItemIndex and firstVisibleItemScrollOffset are the two values needed to restore exact scroll position
- Custom drawing in Compose — knowing that the Canvas Composable uses DrawScope, which provides direct access to the Android Canvas drawing APIs but on the Compose rendering thread; knowing that touch input for drawing requires Modifier.pointerInput(Unit) { detectDragGestures { ... } } to handle the drag gesture; knowing the performance-critical path for freehand drawing: touch input on the UI thread → update the Path state → trigger recomposition → redraw the Canvas; and knowing that for low-latency drawing, the Path must be mutated directly (not creating a new Path on every touch point) because Path mutation does not trigger recomposition while rendering the updated Path state does
- Paging 3 library — knowing that Jetpack Paging 3 is the standard solution for paginated list loading in Android; knowing the PagingSource (defines how to load a page), Pager (creates the Flow of PagingData), and collectAsLazyPagingItems() (the Compose integration that provides LazyPagingItems<T> for use in LazyColumn); knowing that LazyPagingItems.loadState.refresh, loadState.append, and loadState.prepend expose the loading states for each page boundary
- Process death and state restoration — knowing that rememberSaveable saves state into the saved instance state (via the SavedStateRegistry) and restores it after process death; knowing that the ViewModel's SavedStateHandle is the ViewModel-integrated equivalent; and knowing that for complex objects (like LazyListState), a custom Saver implementation is required because the default saver only handles types a Bundle can hold; the listSaver() helper creates a Saver for objects that can be serialised to a list of primitives
- Concurrent edit conflict resolution in real-time collaboration — understanding that the cursor position preservation problem in a collaborative editor is the Operational Transformation (OT) or CRDT (Conflict-free Replicated Data Type) problem; knowing that from the Android/Compose perspective, the approach is: receive the remote diff as an ordered list of operations (insert at index X, delete character at index Y), apply the operations to the TextFieldValue.annotatedString, and adjust the TextFieldValue.selection cursor position based on the operation positions relative to the cursor
2. Framework: Compose Advanced UI Design Model (CAUDM)
- Assumption Documentation — Establish the performance budget for each requirement: rich text editor target is under 16ms recomposition time on mid-range device; infinite scroll target is no visible loading indicator for 80% of scroll events (pre-fetching is sufficient); custom drawing canvas target is under 8ms from touch event to pixel on screen (this requires hardware-accelerated drawing, not software rendering)
- Constraint Analysis — Jetpack Compose's recomposition model is single-threaded on the main thread by default; any state change that triggers recomposition adds to the main thread's work budget; the canvas drawing case requires that path updates do not trigger expensive full recompositions
- Tradeoff Evaluation — TextFieldValue vs. a custom rich text state: TextField with TextFieldValue handles basic cursor position preservation; rich text with concurrent edits requires a custom AnnotatedString-based editor because BasicTextField does not natively support OT or CRDT; the tradeoff between using a third-party rich text library (e.g., RichTextEditor from the Compose Rich Text library) vs. building custom is build-vs-buy
- Hidden Cost Identification — The Paging 3 + LazyColumn scroll position restoration after process death is a commonly overlooked requirement; simply using rememberSaveable(lazyListState) does not work for LazyListState because it is not a primitive type; the custom Saver implementation for LazyListState restoration is a 30-line addition that is non-obvious but prevents the jarring "scroll position reset to top" experience after process death
- Risk Signals / Early Warning Metrics — Recomposition count using Android Studio's "Recomposition highlights" in the Layout Inspector (a Composable that recomposes more than once per user action is likely reading state it does not need to read — refactor with derivedStateOf), canvas frame time measured by FrameMetricsAggregator (drawing latency above 8ms is perceptible to users during fast drawing), scroll jank percentage (should be under 5% for the paginated list with pre-fetching configured correctly)
- Pivot Triggers — If the rich text editor's concurrent edit conflict resolution causes visible cursor jumps on more than 10% of concurrent edit events: the OT implementation has an ordering bug; switch to a CRDT-based approach (specifically, an RGA — Replicated Growable Array — the standard CRDT for collaborative text editing), which is inherently conflict-free and does not require server-side ordering
- Long-Term Evolution Plan — Stabilise the three complex UI components with comprehensive Compose UI tests using createComposeRule() and screenshot tests (Paparazzi); abstract each component into a standalone library module (:ui:rich-text-editor, :ui:infinite-scroll, :ui:drawing-canvas) with its own showcase app for isolated development
3. The Answer
Requirement 1: Concurrent Rich Text Editor
State design: the editor's state is a TextFieldValue which contains the AnnotatedString (the styled text content) and the TextRange selection (cursor position). When a remote edit arrives (from the WebSocket connection), the ViewModel receives an Operation (either Insert(position: Int, text: String) or Delete(position: Int, length: Int)) and must apply it to the current TextFieldValue while adjusting the cursor position. The cursor position adjustment algorithm: for an Insert(position, text) operation, if the cursor is after the insert position, the cursor must be shifted right by text.length; if the cursor is before or at the insert position, the cursor stays put. For a Delete(position, length) operation, if the cursor is within the deleted range, move it to the start of the deletion; if after the deleted range, shift left by length. ViewModel state:
private val _editorState = MutableStateFlow(TextFieldValue())
val editorState: StateFlow<TextFieldValue> = _editorState.asStateFlow()
fun applyRemoteOperation(op: Operation) {
val current = _editorState.value
val newAnnotatedString = current.annotatedString.applyOperation(op)
val newSelection = op.adjustSelection(current.selection)
_editorState.value = TextFieldValue(
annotatedString = newAnnotatedString,
selection = newSelection
)
}
Compose UI: the ViewModel's flow is collected in the Composable (val state by viewModel.editorState.collectAsStateWithLifecycle()) and passed to BasicTextField(value = state, onValueChange = viewModel::onLocalEdit). BasicTextField (not TextField) is used because it gives direct control over the TextFieldValue, allowing the custom AnnotatedString styles (bold, italic, headers) to be reflected in the editor. The key recomposition optimisation: BasicTextField's value parameter accepts TextFieldValue; Compose recomposes the text field only when the TextFieldValue changes; the applyRemoteOperation() function creates a new TextFieldValue with the updated string and selection, which triggers exactly one recomposition of the BasicTextField — no flickering.
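The cursor-adjustment rules described above can be written as pure Kotlin, independent of any Android or Compose API. The Operation types and the adjustCursor function here are an illustrative sketch (the source does not define them); the real implementation would apply the same rules to TextFieldValue.selection:

```kotlin
// Hypothetical operation model for the collaborative editor (names are illustrative).
sealed class Operation {
    data class Insert(val position: Int, val text: String) : Operation()
    data class Delete(val position: Int, val length: Int) : Operation()
}

// Adjust the local cursor index so it points at the same logical character
// after a remote operation has been applied.
fun adjustCursor(cursor: Int, op: Operation): Int = when (op) {
    is Operation.Insert ->
        // Insert strictly before the cursor pushes it right; at or after it, the cursor stays.
        if (op.position < cursor) cursor + op.text.length else cursor
    is Operation.Delete -> when {
        // Deletion entirely at or after the cursor: no change.
        cursor <= op.position -> cursor
        // Cursor inside the deleted range: snap to the start of the deletion.
        cursor < op.position + op.length -> op.position
        // Deletion entirely before the cursor: shift left by the deleted length.
        else -> cursor - op.length
    }
}
```

For example, a remote Insert(2, "ab") moves a cursor at index 5 to index 7, while a cursor at index 2 stays put, which is exactly the "shift right for insert before cursor, stay for insert at or after" rule in the prose.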
Requirement 2: Infinite Scroll with Scroll Position Restoration After Process Death
Paging 3 setup:
// In the ViewModel
val notesPagingFlow: Flow<PagingData<Note>> = Pager(
config = PagingConfig(
pageSize = 20,
prefetchDistance = 5, // Prefetch 5 items ahead to prevent visible loading
enablePlaceholders = false
)
) {
notesPagingSource
}.flow.cachedIn(viewModelScope) // cachedIn prevents re-fetching on recomposition
Scroll position restoration after process death — the custom Saver for LazyListState:
// In the Composable
val lazyListState = rememberSaveable(saver = LazyListStateSaver) {
LazyListState()
}
val LazyListStateSaver = listSaver<LazyListState, Int>(
save = { listOf(it.firstVisibleItemIndex, it.firstVisibleItemScrollOffset) },
restore = { LazyListState(it[0], it[1]) }
)
The listSaver stores firstVisibleItemIndex and firstVisibleItemScrollOffset as primitives in the SavedStateRegistry. After process death and re-creation, LazyListState is restored with the exact scroll position, including partial scroll within the visible item. Inline loading and error states in the LazyColumn:
LazyColumn(state = lazyListState) {
items(count = notes.itemCount) { index ->
notes[index]?.let { note -> NoteCard(note) }
}
notes.loadState.apply {
when {
refresh is LoadState.Loading -> item { LoadingIndicator() }
append is LoadState.Loading -> item { LoadingMoreIndicator() }
refresh is LoadState.Error -> item { ErrorRetryCard(onRetry = notes::retry) }
append is LoadState.Error -> item { AppendErrorCard(onRetry = notes::retry) }
}
}
}
Requirement 3: High-Performance Drawing Canvas
The key insight for low-latency drawing: do not store the path as an immutable List<Offset> (which requires creating a new list on every touch point and triggers recomposition). Instead, use a mutable Path object that is mutated in-place on touch events — the mutation itself does not trigger recomposition, and because the Path reference never changes, Compose cannot detect that a redraw is needed. To trigger the Canvas to re-render with the updated path, use a separate var drawTick by mutableStateOf(0) counter that increments on each touch event — the Canvas draw lambda reads drawTick, so it re-executes on each increment, re-drawing the Path in its current (mutated) state:
@Composable
fun DrawingCanvas() {
val path = remember { Path() }
var drawTick by remember { mutableStateOf(0) }
Canvas(
modifier = Modifier
.fillMaxSize()
.pointerInput(Unit) {
detectDragGestures(
onDragStart = { offset ->
path.moveTo(offset.x, offset.y)
drawTick++
},
onDrag = { change, _ ->
path.lineTo(change.position.x, change.position.y)
drawTick++ // Increments trigger Canvas recomposition
}
)
}
) {
drawTick // Reading drawTick inside the draw lambda establishes the dependency that re-executes it on each increment
drawPath(
path = path,
color = Color.Black,
style = Stroke(width = 4.dp.toPx(), cap = StrokeCap.Round)
)
}
}
Performance: the Path.lineTo() call on each onDrag event takes approximately 0.1ms; the Canvas recomposition and redraw of the Path takes approximately 2–4ms on a mid-range device with hardware acceleration enabled. Total latency from touch event to pixel: approximately 3–5ms — well within the 8ms perceptible latency threshold. Hardware acceleration: ensure the Canvas Composable's parent has graphicsLayer() applied, which promotes it to a separate hardware layer and prevents the canvas redraws from triggering recomposition of the parent layout. For complex paths (more than 500 points), consider splitting the path into segments: paths with thousands of points slow the GPU's path rasterisation; draw completed segments into a Picture (immutable recorded drawing) and only re-render the current in-progress segment.
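The segment-splitting strategy for long paths can be modelled without any Compose APIs: points accumulate in a live segment, and once the segment reaches the point budget it is frozen (in the real implementation, recorded into a Picture) and a fresh segment starts. The class name and the 500-point threshold below are illustrative assumptions:

```kotlin
// Framework-free model of path segmentation: only the live segment needs
// re-rasterising every frame; frozen segments are drawn from a cached recording.
class SegmentedPathBuffer(private val maxPointsPerSegment: Int = 500) {
    private val live = mutableListOf<Pair<Float, Float>>()
    var frozenSegments = 0
        private set

    fun addPoint(x: Float, y: Float) {
        live += x to y
        if (live.size >= maxPointsPerSegment) {
            frozenSegments++ // Real code: record `live` into a Picture here
            live.clear()
        }
    }

    // Size of the only segment the GPU must re-rasterise on each frame.
    val livePointCount: Int
        get() = live.size
}
```

With a threshold of 500, a 10,000-point sketch costs the GPU at most 500 points of fresh rasterisation per frame instead of the whole path.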
Early Warning Metrics:
- Recomposition count per user interaction in the rich text editor — use Android Studio's Recomposition highlights during development to confirm that a remote operation triggers exactly one recomposition of the editor, not a cascade through parent Composables
- Scroll restoration test — an automated UI test that puts the app in the background, kills the process (adb shell am kill <package>), relaunches, and asserts that the lazyListState.firstVisibleItemIndex matches the value before process kill
- Canvas draw latency (P95 from the detectDragGestures onDrag callback to the completed Canvas redraw) — measured by timestamping the touch event and the end of the draw lambda in a benchmark build; P95 above 8ms triggers investigation of the hardware layer configuration
4. Interview Score: 9.5 / 10
Why this demonstrates senior-level maturity: The drawTick counter pattern for triggering Canvas recomposition without creating new Path objects on every touch event is the specific non-obvious Compose optimisation that prevents the "list of Offsets" anti-pattern that creates GC pressure and recomposition overhead on every drag event — this is knowledge that comes from actually building drawing applications in Compose. The custom LazyListStateSaver with listSaver() storing firstVisibleItemIndex and firstVisibleItemScrollOffset directly addresses the commonly missed scroll restoration requirement with the exact implementation. The cursor position adjustment algorithm for OT (shift right for insert after cursor, stay for insert before) is the specific mathematical logic that makes the collaborative editor not have cursor jumping.
What differentiates it from mid-level thinking: A mid-level Android developer would use rememberSaveable(lazyListState) for scroll restoration (which does not work for complex objects like LazyListState) without knowing about listSaver(), would create a new List<Offset> on every touch event in the drawing canvas (causing GC pressure), and would not know about derivedStateOf or cachedIn(viewModelScope) for the paging flow.
What would make it a 10/10: A 10/10 response would include the complete AnnotatedString.applyOperation() extension function implementation (showing the text insertion/deletion with annotation range adjustment), a Paparazzi screenshot test setup for the DrawingCanvas golden image regression testing, and a complete MacrobenchmarkRule test configuration for measuring cold startup performance of the collaborative note editor.
Question 5: Security — Implementing Secure Data Storage and Network Communication on Android
Difficulty: Senior | Role: Android Developer | Level: Senior | Company Examples: Google, Square, Signal, Revolut, Bitwarden
The Question
You are a Senior Android Developer at a healthcare company building an Android app that stores and transmits sensitive patient health records. The app must comply with HIPAA (Health Insurance Portability and Accountability Act) for US users and NHS Digital's DSPT (Data Security and Protection Toolkit) for UK users. A third-party security audit has identified 4 vulnerabilities: (1) The app stores the user's authentication token in SharedPreferences in plaintext — readable by any app on a rooted device; (2) All API traffic is encrypted with TLS, but the app accepts any valid TLS certificate including self-signed ones, making it vulnerable to man-in-the-middle attacks on public Wi-Fi; (3) The app's SQLite database containing patient records is not encrypted; (4) Sensitive patient data (medication names, diagnoses) appears in the Android Recent Apps screen as a screenshot of the last viewed screen. Walk through the specific technical remediation for each vulnerability, the Android APIs and libraries involved, and the additional security practices that a healthcare app should implement beyond fixing these four issues.
1. What Is This Question Testing?
- Android Keystore and EncryptedSharedPreferences — knowing that the Android Keystore System provides hardware-backed key storage on devices with a TEE (Trusted Execution Environment) or StrongBox; keys stored in the Keystore cannot be extracted from the device (even on rooted devices); knowing that EncryptedSharedPreferences from androidx.security.crypto uses the Android Keystore to encrypt shared preference keys and values; and knowing that for an authentication token, EncryptedSharedPreferences is the correct storage solution (not a Room database, not a custom encryption scheme)
- Certificate pinning — knowing that TLS encryption ensures traffic is encrypted but does not verify that the server is who it claims to be if the attacker has a fraudulent certificate trusted by the system's CA store; certificate pinning specifies the exact public key or certificate fingerprint that the app will accept, preventing connections to servers with different certificates even if they are signed by a trusted CA; knowing that OkHttp's CertificatePinner is the standard implementation on Android; knowing the tradeoff: pinning the leaf certificate (exact match, breaks on certificate renewal) vs. pinning the intermediate CA (more resilient, still prevents MITM) vs. pinning the public key hash (most resilient — does not change on certificate renewal, provided the same key pair is reused)
- SQLCipher for encrypted databases — knowing that Android's built-in SQLite database is not encrypted; knowing that SQLCipher (from Zetetic) is the standard encrypted SQLite library for Android and integrates with Room through the SQLCipher for Android SupportFactory; knowing that SQLCipher uses AES-256-CBC encryption for the database file; knowing that the passphrase used to encrypt the database should be derived from the user's authentication credentials (not stored in plaintext) or generated and protected via the Android Keystore
- FLAG_SECURE for sensitive screens — knowing that WindowManager.LayoutParams.FLAG_SECURE prevents the Android system from capturing screenshots of the flagged window, which keeps sensitive content out of the Recent Apps screen and prevents screenshots from being taken; knowing that in Compose this is applied at the Activity level via window.setFlags(WindowManager.LayoutParams.FLAG_SECURE, WindowManager.LayoutParams.FLAG_SECURE) in the Activity's onCreate(), or selectively using a SecureWindow Composable wrapper; knowing the tradeoff: applying FLAG_SECURE globally prevents all screenshots (useful for banking apps) vs. applying it only to sensitive screens (better UX, requires identifying which screens contain PHI)
- Additional healthcare security practices — knowing practices beyond the four fixes: root detection (a rooted device has bypassed Android's security model — healthcare apps should warn users on rooted devices and consider refusing to run), biometric authentication using the BiometricPrompt API (requiring fingerprint or face authentication to unlock the app after backgrounding), automatic session timeout (HIPAA requires automatic session termination after a period of inactivity), certificate transparency enforcement, a network security config in res/xml/network_security_config.xml, and ProGuard/R8 obfuscation to prevent reverse engineering of the business logic
- Regulatory compliance specifics — knowing that HIPAA's Technical Safeguard requirements for mobile applications include: automatic logoff, encryption of PHI in transit and at rest, unique user authentication, and audit controls (logging all access to PHI); knowing that NHS DSPT requires data protection impact assessments (DPIAs) for mobile apps handling patient data and annual security assessments
2. Framework: Android Security Remediation Model (ASRM)
- Assumption Documentation — Confirm the minimum API level (API 23/Android 6.0 is the minimum for full Android Keystore hardware-backed key support; below API 23, the Keystore is software-backed and less secure); confirm whether the app supports offline mode (if so, the database passphrase management is more complex — it cannot be derived from a server session token that is unavailable offline)
- Constraint Analysis — HIPAA Technical Safeguards and NHS DSPT compliance are legal requirements, not optional improvements; the 4 vulnerabilities must be remediated before the app can be distributed to healthcare providers; the timeline for remediation is driven by the audit report's finding severity (3 of the 4 are "Critical" severity in a healthcare context)
- Tradeoff Evaluation — For certificate pinning: leaf certificate pinning (strongest, breaks on renewal) vs. public key pinning (resilient to renewal, requires coordination with the server team to pre-generate next rotation's public key) vs. CA pinning (weakest of the three, but simplest to maintain); for a healthcare app, public key pinning is the correct balance — it is stronger than CA pinning while being more maintenance-friendly than leaf certificate pinning
- Hidden Cost Identification — SQLCipher database migration: adding encryption to an existing unencrypted Room database requires a migration strategy; existing users have a plaintext database file, so on the first launch after the update the app must export its contents and re-import them into a new SQLCipher-encrypted database; this migration must run on a background thread behind a migration progress screen (it can take 10–30 seconds for large databases)
- Risk Signals / Early Warning Metrics — Certificate pinning failure rate in production (if the pinning check fails due to a certificate rotation that was not coordinated, all users will be unable to connect; monitor the OkHttp CertificatePinner failure rate in Firebase Crashlytics), database encryption migration success rate (monitor the percentage of users who successfully complete the SQLCipher migration on update — a failure leaves the database unencrypted and must trigger a retry mechanism), biometric authentication failure rate (monitor the percentage of users who fail biometric authentication more than 3 times in a session — this may indicate the biometric prompt is misconfigured for some device models)
- Pivot Triggers — If the certificate pinning deployment causes a 30%+ increase in API error rates within 24 hours of release (indicating a certificate rotation that was not communicated to the mobile team): deploy a hotfix that reverts the certificate pin to the previous value while the server team coordinates the correct certificate hash for the next rotation
- Long-Term Evolution Plan — Month 1: remediate the 4 critical vulnerabilities; Month 2: add biometric authentication and automatic session timeout; Month 3: root detection and emulator detection; Month 4: penetration test by a different security firm to validate the remediations; annually: security assessment per NHS DSPT requirements
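The public-key-pin tradeoff evaluated above can be made concrete: an OkHttp-style pin is the base64-encoded SHA-256 digest of the certificate's DER-encoded SubjectPublicKeyInfo. A minimal JVM sketch using only java.security (the generated key pair here is a stand-in for a real server certificate's key):

```kotlin
import java.security.KeyPairGenerator
import java.security.MessageDigest
import java.security.PublicKey
import java.util.Base64

// Computes an OkHttp-style "sha256/..." pin from a public key.
// PublicKey.encoded returns the DER-encoded SubjectPublicKeyInfo,
// which is what OkHttp's CertificatePinner hashes.
fun spkiPin(key: PublicKey): String {
    val digest = MessageDigest.getInstance("SHA-256").digest(key.encoded)
    return "sha256/" + Base64.getEncoder().encodeToString(digest)
}

fun main() {
    // Stand-in key pair; in production the public key comes from the server's certificate chain
    val keyPair = KeyPairGenerator.getInstance("RSA").apply { initialize(2048) }.genKeyPair()
    println(spkiPin(keyPair.public)) // sha256/ followed by a 44-character base64 digest
}
```

Because the pin depends only on the public key, a certificate renewal that reuses the same key pair leaves the pin unchanged — the operational property that makes public key pinning more maintainable than leaf pinning.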
3. The Answer
Vulnerability 1: Authentication Token in Plaintext SharedPreferences — EncryptedSharedPreferences
The fix: replace SharedPreferences with EncryptedSharedPreferences from androidx.security.crypto. The token is encrypted using AES-256-GCM with a key stored in the Android Keystore that is hardware-backed on devices with a TEE. Implementation:
private val masterKey = MasterKey.Builder(context)
.setKeyScheme(MasterKey.KeyScheme.AES256_GCM)
.setUserAuthenticationRequired(false) // Set true to require biometric for key access
.build()
private val encryptedPreferences = EncryptedSharedPreferences.create(
context,
"secure_prefs",
masterKey,
EncryptedSharedPreferences.PrefKeyEncryptionScheme.AES256_SIV,
EncryptedSharedPreferences.PrefValueEncryptionScheme.AES256_GCM
)
// Storing the token — identical API to SharedPreferences
encryptedPreferences.edit().putString(KEY_AUTH_TOKEN, token).apply()
// Retrieving the token
val token = encryptedPreferences.getString(KEY_AUTH_TOKEN, null)
The encrypted preferences file on the device is an unreadable ciphertext blob even on a rooted device — the decryption key is protected by the Android Keystore and never leaves the secure hardware. Additionally: configure the master key with setUserAuthenticationRequired(true) and a key validity duration of 300 seconds — this requires the user to re-authenticate with biometrics every 5 minutes to unlock the Keystore key, which satisfies HIPAA's unique user authentication and automatic logoff requirements simultaneously.
Vulnerability 2: TLS Without Certificate Pinning — OkHttp CertificatePinner
The fix: add public key pinning to the OkHttp client. Pin the SHA-256 hash of the server's public key (not the certificate, which changes on renewal — the public key remains stable across certificate renewals if the same key pair is used). Get the pin hash: openssl s_client -connect api.healthapp.com:443 | openssl x509 -pubkey -noout | openssl pkey -pubin -outform der | openssl dgst -sha256 -binary | base64. Configure OkHttp:
val certificatePinner = CertificatePinner.Builder()
.add("api.healthapp.com", "sha256/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=") // Current key
.add("api.healthapp.com", "sha256/BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB=") // Backup key for rotation
.build()
val okHttpClient = OkHttpClient.Builder()
.certificatePinner(certificatePinner)
.build()
Always configure two pins: the current public key and the backup key for the next rotation. The server team must pre-generate the next rotation's key pair before the current certificate expires, so both hashes are known in advance. Additionally: add a network_security_config.xml in res/xml/ that disables cleartext (HTTP) traffic entirely for all domains (already required for API 28+, but explicit configuration documents the intent):
<network-security-config>
<domain-config cleartextTrafficPermitted="false">
<domain includeSubdomains="true">healthapp.com</domain>
</domain-config>
</network-security-config>
Vulnerability 3: Unencrypted SQLite Database — SQLCipher with Room
The fix: replace the unencrypted Room database with an SQLCipher-encrypted database. The passphrase is generated using SecureRandom and stored in the Android Keystore (not in code, not in SharedPreferences — in the Keystore, the same hardware-backed secure storage used for the authentication token). Database passphrase management:
// Generate and store the passphrase in the Keystore on first launch
fun getOrCreateDatabasePassphrase(): ByteArray {
val alias = "patient_db_key"
return if (keyStore.containsAlias(alias)) {
// Retrieve existing passphrase — wrapped/unwrapped by the Keystore key
keystoreWrapper.unwrapKey(alias)
} else {
// Generate a new 32-byte passphrase
val passphrase = ByteArray(32).also { SecureRandom().nextBytes(it) }
keystoreWrapper.wrapAndStore(alias, passphrase)
passphrase
}
}
Room + SQLCipher integration:
val passphrase = getOrCreateDatabasePassphrase()
val factory = SupportFactory(passphrase)
val database = Room.databaseBuilder(context, PatientDatabase::class.java, "patient_db")
.openHelperFactory(factory) // SQLCipher replaces the default SQLiteOpenHelper
.addMigrations(/* ... */)
.build()
// Clear the passphrase from memory immediately after database construction
passphrase.fill(0)
Migration of existing users (those who have the unencrypted database on their device): on first launch after the update, detect that the existing database is unencrypted (attempt to open with SQLCipher; if it fails with a "file is not a database" error, the existing database is unencrypted and must be migrated). Run the migration on a background thread: open the existing unencrypted database, export all data, close it, create a new SQLCipher database with the Keystore-derived passphrase, import all data, delete the old unencrypted database file. Show a "Securing your health records..." progress screen during this migration.
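The migration sequence can be reduced to a small, framework-free driver so the decision logic is unit-testable without SQLCipher or Android on the classpath — the class name and injected operations below are hypothetical stand-ins for the real Room/SQLCipher calls:

```kotlin
// Hypothetical first-launch migration driver. In the real app, the injected lambdas
// would try to open the file with SQLCipher, read all rows from the legacy plaintext
// database, insert them into the new encrypted database, and delete the old file.
class EncryptionMigrator(
    private val opensWithSqlCipher: () -> Boolean,       // false ⇒ "file is not a database" ⇒ legacy plaintext DB
    private val exportPlaintextRows: () -> List<String>, // stand-in for exporting all tables
    private val importIntoEncrypted: (List<String>) -> Unit,
    private val deletePlaintextFile: () -> Unit,
) {
    /** Returns true if a migration was performed, false if the DB was already encrypted. */
    fun migrateIfNeeded(): Boolean {
        if (opensWithSqlCipher()) return false // already encrypted — nothing to do
        val rows = exportPlaintextRows()       // 1. export from the unencrypted DB
        importIntoEncrypted(rows)              // 2. import into the SQLCipher DB
        deletePlaintextFile()                  // 3. remove the plaintext file only after import succeeds
        return true
    }
}
```

In production this runs on a background dispatcher behind the progress screen, and a failure must leave the plaintext file in place so the retry mechanism mentioned in the early warning metrics can re-run the migration.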
Vulnerability 4: PHI in Recent Apps Screenshot — FLAG_SECURE
The fix: apply FLAG_SECURE to all Activities that display PHI. Since the app uses a single-Activity architecture with Compose, apply it globally in MainActivity.onCreate() while providing a mechanism to temporarily lift the flag for non-sensitive screens (the home screen and settings screen do not contain PHI and can allow screenshots for accessibility purposes).
class MainActivity : ComponentActivity() {
override fun onCreate(savedInstanceState: Bundle?) {
super.onCreate(savedInstanceState)
window.setFlags(
WindowManager.LayoutParams.FLAG_SECURE,
WindowManager.LayoutParams.FLAG_SECURE
)
setContent {
HealthApp()
}
}
}
For the non-PHI screens where screenshots should be allowed: use a Compose DisposableEffect to temporarily remove the flag when the non-sensitive screen is displayed and restore it when the screen leaves the composition:
@Composable
fun AllowScreenshots() {
val window = (LocalContext.current as Activity).window
DisposableEffect(Unit) {
window.clearFlags(WindowManager.LayoutParams.FLAG_SECURE)
onDispose {
window.setFlags(
WindowManager.LayoutParams.FLAG_SECURE,
WindowManager.LayoutParams.FLAG_SECURE
)
}
}
}
Additional Healthcare Security Practices
Beyond the four fixes: (1) Biometric authentication with BiometricPrompt: require fingerprint or face authentication when the app comes to the foreground after more than 5 minutes in the background; implement with BiometricManager.canAuthenticate(BIOMETRIC_STRONG) to confirm the device supports strong biometrics before requiring it. (2) Automatic session timeout: a LifecycleObserver in the Application class tracks the time since the app was last in the foreground; after 15 minutes of background time, invalidate the authentication token and require re-login on next foreground. (3) PHI access audit logging: every read of patient data must be logged with the user ID, the accessed record ID, the timestamp, and the device ID; these audit logs are stored locally (encrypted in Room) and synced to the server at the next available opportunity; HIPAA requires a 6-year retention of access audit logs. (4) R8 obfuscation: enable full R8 obfuscation in the release build to prevent reverse engineering of the cryptographic key derivation logic and the patient data models. (5) Root detection: use the RootBeer library (open source) to detect rooted devices; warn the user that using a healthcare app on a rooted device reduces the security of their health data; consider refusing to display PHI on rooted devices per the organisation's security policy.
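The automatic session timeout in (2) can be sketched framework-free with an injectable clock, so the logic is testable without waiting 15 real minutes; the class and method names are illustrative, and in the app the two callbacks would be driven by a ProcessLifecycleOwner observer:

```kotlin
// Tracks how long the app spent in the background and decides whether the
// session must be invalidated on return to the foreground.
class SessionTimeoutPolicy(
    private val timeoutMs: Long = 15 * 60 * 1000L,          // 15 minutes, per the policy above
    private val clock: () -> Long = System::currentTimeMillis, // injectable for tests
) {
    private var backgroundedAt: Long? = null

    fun onAppBackgrounded() { backgroundedAt = clock() }

    /** Returns true if the auth token must be invalidated and re-login required. */
    fun onAppForegrounded(): Boolean {
        val start = backgroundedAt ?: return false // never backgrounded — session still valid
        backgroundedAt = null
        return clock() - start >= timeoutMs
    }
}
```

When onAppForegrounded() returns true, the caller clears the token from EncryptedSharedPreferences and routes the user to the login screen — satisfying HIPAA's automatic logoff requirement.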
Early Warning Metrics:
- EncryptedSharedPreferences decryption failure rate in Crashlytics — a KeyStoreException during token decryption indicates the Keystore key has been invalidated (this can happen if the device's lock screen is removed or biometrics are changed); when this occurs, the user must re-authenticate and a new key is provisioned; monitor the rate to detect unexpected key invalidation patterns
- Certificate pin failure rate — a sudden spike in CertificatePinner failures indicates either a server certificate rotation that was not communicated to the mobile team or an active MITM attack; both scenarios require an immediate engineering response
- SQLCipher migration success rate in the first 7 days post-update release — the percentage of users who successfully complete the database encryption migration; below a 95% success rate, trigger an investigation into which device types are failing and whether a specific database size threshold causes the migration to time out
4. Interview Score: 9.5 / 10
Why this demonstrates senior-level maturity: The public key pinning selection (not leaf certificate pinning, not CA pinning) with the specific rationale (public key remains stable across renewals if the same key pair is used) and the two-pin configuration (current + backup for rotation coordination) is the production-hardened certificate pinning design that avoids the most common certificate pinning operational failure (pins breaking on renewal because only one pin was configured). The SQLCipher passphrase management — generating a random 32-byte key, storing it in the Android Keystore, and zeroing the byte array from memory immediately after database construction — shows security engineering depth that goes beyond "use SQLCipher." The PHI audit logging requirement for HIPAA (6-year retention, every PHI access logged with user/record/timestamp/device) is the regulatory compliance knowledge that separates an Android developer who has built healthcare apps from one who is guessing.
What differentiates it from mid-level thinking: A mid-level Android developer would replace SharedPreferences with EncryptedSharedPreferences (correct), add OkHttp certificate pinning (correct), add SQLCipher (correct), and add FLAG_SECURE (correct) — but would not provide the public key hash generation command, would not know about the two-pin backup configuration, would not design the passphrase key-wrapping with the Keystore, would not design the existing database migration strategy, and would not know about HIPAA's audit logging requirement.
What would make it a 10/10: A 10/10 response would include the complete Keystore key wrapping/unwrapping implementation for the SQLCipher passphrase (showing the KeyGenerator and KeyStore API usage), a concrete BiometricPrompt implementation showing the CryptoObject integration with the Keystore-backed cipher, and a specific HIPAA Technical Safeguard compliance checklist mapped to each of the implemented security controls.
Question 6: Kotlin Coroutines and Flow — Advanced Concurrency Patterns in Production
Difficulty: Senior | Role: Android Developer | Level: Senior | Company Examples: JetBrains, Google, Kotlin Foundation, Cash App, Monzo
The Question
You are a Senior Android Developer at a ride-hailing company. The app tracks the driver's real-time location, displays a live map with nearby drivers, sends and receives chat messages with the rider, and calculates the ETA displayed on a countdown timer. All four data streams update simultaneously and must be displayed in a single RideViewModel. A code review has flagged several issues in the current implementation: the location updates use GlobalScope.launch (a leaked coroutine that is not cancelled when the user leaves the screen), the map and chat data streams are collected in separate lifecycleScope.launchWhenStarted blocks (now deprecated), the ETA countdown is implemented with a Handler and Runnable posted every second (leaks memory and does not respect lifecycle), and when the fare calculation API call fails, the error is swallowed silently (the catch block calls println() and does nothing else). Refactor all four implementations using idiomatic, production-quality Kotlin Coroutines and Flow patterns, explaining the specific bug each refactor fixes and why the correct pattern prevents it.
1. What Is This Question Testing?
- Structured concurrency and scope — understanding that GlobalScope is an anti-pattern because the launched coroutines are not bound to any lifecycle; they continue running even after the Activity or ViewModel is destroyed; the correct replacement is viewModelScope.launch (automatically cancelled when the ViewModel is cleared) for ViewModel-initiated coroutines and lifecycleScope.launch (automatically cancelled at the lifecycle boundary) for UI-layer coroutines; knowing that viewModelScope is backed by a SupervisorJob — a child coroutine failure does not cancel siblings
- repeatOnLifecycle and collectAsStateWithLifecycle — knowing that launchWhenStarted does not cancel the coroutine when the lifecycle drops below STARTED — it merely suspends the collection; this means the producer (the Flow upstream) continues running and buffering items even when the app is in the background (consuming resources and potentially collecting sensitive location data unnecessarily); repeatOnLifecycle(Lifecycle.State.STARTED) properly cancels the collection when the lifecycle drops below STARTED and restarts it when it rises back — this is the correct pattern for UI-layer Flow collection
- StateFlow vs SharedFlow — knowing the differences: StateFlow always has a value (requires an initial value), emits only when the value changes, and new collectors always receive the most recent value — appropriate for UI state; SharedFlow has no initial value, can buffer multiple values, and is appropriate for events (navigation events, one-time error messages) that should not be replayed to new collectors; knowing that MutableStateFlow is the canonical ViewModel UI state holder
- Combining multiple Flows with combine — knowing that combine(flow1, flow2, flow3) { a, b, c -> ... } emits a new combined value whenever any of the input flows emits, using the most recent value from each of the other flows; this is the correct pattern for combining real-time data streams (location + nearby drivers + chat + ETA) into a single UiState that the Composable collects once
- tickerFlow and timer implementation — knowing that a countdown timer in Kotlin Coroutines is implemented as flow { while (true) { emit(Unit); delay(1_000) } } (a flow that emits every second), which is properly cancelled when the collecting coroutine is cancelled; this replaces the Handler/Runnable pattern that requires manual removeCallbacks cleanup
- Coroutine exception handling — knowing the three mechanisms: try/catch inside a coroutine (catches exceptions at the call site), CoroutineExceptionHandler (catches unhandled exceptions in a root coroutine's scope), and the Flow.catch operator (catches upstream Flow exceptions and allows recovery or emission of an error state); knowing that a silent catch that only logs is a correctness bug — the UI must receive the error state to show the user an error message
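The Flow.catch mechanism — which the refactor below handles with a Result type instead — can be sketched in isolation; the FareState names are hypothetical, and the sketch requires kotlinx-coroutines-core:

```kotlin
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.runBlocking

sealed interface FareState {
    object Loading : FareState
    data class Loaded(val pence: Long) : FareState
    data class Error(val message: String) : FareState
}

// catch only sees *upstream* exceptions, so the API call happens inside the
// flow builder — a failure there is converted into an Error state emission
// instead of crashing the collector.
fun fareStates(api: suspend () -> Long): Flow<FareState> =
    flow<FareState> { emit(FareState.Loaded(api())) }
        .onStart { emit(FareState.Loading) }
        .catch { e -> emit(FareState.Error(e.message ?: "Unknown error")) }

fun main() = runBlocking {
    // A failing API call yields [Loading, Error(...)] rather than an unhandled exception
    val states = fareStates { throw IllegalStateException("fare service down") }.toList()
    check(states.first() == FareState.Loading)
    check(states.last() is FareState.Error)
}
```

Note the ordering: onStart emits Loading before the upstream runs, and catch sits last in the chain so it sees failures from both the flow body and onStart.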
2. Framework: Kotlin Coroutines Production Pattern Model (KCPPM)
- Assumption Documentation — Identify whether each data stream is: a one-time value (use a suspend fun called from viewModelScope.launch), a continuous stream that the ViewModel produces (use a Flow from the repository, transformed and exposed as StateFlow), or a stream that the ViewModel must react to continuously (use viewModelScope.launch with collect); the categorisation determines the correct coroutine/Flow pattern
- Constraint Analysis — The ViewModel must survive configuration changes (rotation) without restarting the streams; viewModelScope is tied to the ViewModel's lifecycle (not the Activity's lifecycle), so streams started in viewModelScope.launch automatically survive rotation
- Tradeoff Evaluation — Exposing a StateFlow vs. a cold Flow from the ViewModel: a StateFlow has an initial value and always has a current value — new collectors (after a rotation) immediately receive the last emitted state; a cold Flow re-executes from the beginning for each collector — inappropriate for streams that should not restart on rotation (location tracking, WebSocket connections)
- Hidden Cost Identification — combine waits for all input flows to emit at least one value before emitting the first combined value; if the chat stream has never received a message, combine will not emit a combined state until the first message arrives; use combine with nullable or default-value initial states to prevent this blocking
- Risk Signals / Early Warning Metrics — Coroutine leak detection using LeakCanary (the same tool that detects memory leaks also detects leaked coroutines in non-viewModelScope usage), ANR rate caused by coroutine dispatcher contention (Dispatchers.Main.immediate vs. Dispatchers.Main — using the immediate dispatcher in a tight loop can starve the main thread)
- Pivot Triggers — If the WebSocket chat stream emits at 100 messages/second (a burst during high-traffic events), the combine operator will trigger 100 recompositions per second; apply sample(50) to the chat stream within the combine to cap recompositions at 20/second — since each emission carries the full message list, sampling skips intermediate emissions without losing any messages (debounce would be the wrong operator here: during a sustained burst it would not emit at all until the stream goes quiet)
- Long-Term Evolution Plan — Abstract the four data streams into a single RideRepository that exposes one Flow<RideState> combining all data sources; the ViewModel simply collects this single stream; this reduces the ViewModel's complexity and makes the combination logic independently testable
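The combine-blocks-until-all-emit hidden cost can be demonstrated and fixed directly: backing each stream with a StateFlow (which always holds a value) guarantees the first combined emission is immediate. A minimal sketch with hypothetical names, requiring kotlinx-coroutines-core:

```kotlin
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.combine
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.runBlocking

data class RideSnapshot(val etaSeconds: Int?, val chatCount: Int, val driverVisible: Boolean)

fun main() = runBlocking {
    // StateFlows carry defaults, so combine can emit immediately — even though
    // no chat message or ETA has arrived yet.
    val eta = MutableStateFlow<Int?>(null)
    val chat = MutableStateFlow(emptyList<String>())
    val driver = MutableStateFlow(false)

    val combined = combine(eta, chat, driver) { e, c, d ->
        RideSnapshot(etaSeconds = e, chatCount = c.size, driverVisible = d)
    }

    // first() returns as soon as the initial combined value is available;
    // with cold flows that emit nothing initially, this call would suspend instead.
    println(combined.first()) // RideSnapshot(etaSeconds=null, chatCount=0, driverVisible=false)
}
```

The same default-value discipline is what lets the ViewModel's combined uiState start in a renderable state rather than silently suspending until the slowest stream produces its first item.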
3. The Answer
Bug 1: GlobalScope — The Coroutine That Never Dies
The problem: GlobalScope.launch { locationManager.startLocationUpdates() } creates a coroutine that is not scoped to any lifecycle. When the user leaves the ride screen, the ViewModel is cleared, but the location updates coroutine continues running indefinitely — consuming battery, GPS, and network resources until the process is killed. The fix: move to viewModelScope:
// BEFORE (broken)
init {
GlobalScope.launch {
locationManager.locationUpdates.collect { location ->
_driverLocation.value = location
}
}
}
// AFTER (correct)
init {
viewModelScope.launch {
locationManager.locationUpdates.collect { location ->
_driverLocation.value = location
}
}
}
viewModelScope is automatically cancelled in ViewModel.onCleared() — when the user navigates away, the location collection stops immediately. For the repository-level location flow, use the stateIn operator to convert the cold Flow into a hot StateFlow that survives configuration changes:
val driverLocation: StateFlow<LatLng?> = locationRepository.locationUpdates
.stateIn(
scope = viewModelScope,
started = SharingStarted.WhileSubscribed(5_000), // Stop 5s after last subscriber — allows for rotation
initialValue = null
)
SharingStarted.WhileSubscribed(5_000) keeps the upstream flow active for 5 seconds after the last subscriber disconnects — this handles configuration changes (rotation takes under 1 second) without restarting the location updates stream.
Bug 2: launchWhenStarted — The Silent Background Listener
The problem: lifecycleScope.launchWhenStarted { nearbyDriversFlow.collect { ... } } suspends the collection when the app goes to the background, but the upstream Flow producer continues running and buffering items. For a WebSocket stream of nearby drivers at scale (hundreds of driver position updates per minute), this creates a large buffer of unconsumed items that are all replayed when the app comes back to the foreground — causing a burst of UI updates. It also means the WebSocket connection stays open in the background. The fix: repeatOnLifecycle:
// BEFORE (broken — in Fragment/Activity)
lifecycleScope.launchWhenStarted {
viewModel.nearbyDrivers.collect { drivers -> updateMap(drivers) }
}
// AFTER (correct — in Fragment/Activity)
viewLifecycleOwner.lifecycleScope.launch {
viewLifecycleOwner.repeatOnLifecycle(Lifecycle.State.STARTED) {
viewModel.nearbyDrivers.collect { drivers -> updateMap(drivers) }
}
}
repeatOnLifecycle(STARTED) cancels the collect (and the upstream collection) when the lifecycle drops below STARTED and restarts it when the lifecycle rises back to STARTED. For Compose, use collectAsStateWithLifecycle() which does the same thing:
// In Compose (equivalent to repeatOnLifecycle)
val nearbyDrivers by viewModel.nearbyDrivers.collectAsStateWithLifecycle()
Bug 3: Handler/Runnable Timer — The Leaked Background Thread
The problem: a Handler(Looper.getMainLooper()).postDelayed(runnable, 1000) chain requires manual removeCallbacks(runnable) cleanup; if the Fragment is destroyed before the Runnable is cancelled, the Runnable holds a reference to the Fragment's view and causes a memory leak. The fix: a coroutine-based ticker flow:
// ViewModel
private fun tickerFlow(periodMs: Long) = flow {
while (true) {
emit(Unit)
delay(periodMs)
}
}
val etaCountdown: StateFlow<Int> = tickerFlow(1_000)
.map { calculateRemainingSeconds() }
.stateIn(
scope = viewModelScope,
started = SharingStarted.WhileSubscribed(5_000),
initialValue = estimatedArrivalSeconds
)
The tickerFlow is automatically cancelled when viewModelScope is cancelled — no manual cleanup required. The delay(periodMs) in the flow is a suspending function that respects coroutine cancellation — if the ViewModel is cleared mid-delay, the coroutine is cancelled immediately without waiting for the delay to complete. Unlike Handler.postDelayed, this approach has no memory leak risk because the coroutine holds no reference to any View.
Bug 4: Swallowed Exception — Silent Failure in the Fare API Call
The problem: the catch block calls println() and does nothing else, meaning the user sees no error when the fare calculation fails — they see a spinner that never resolves. In production, this is a user experience failure and a diagnostic gap. The fix: structured error handling that emits an error state to the UI:
// BEFORE (broken)
fun calculateFare() {
viewModelScope.launch {
try {
val fare = fareRepository.calculateFare(rideId)
_uiState.value = RideUiState.FareCalculated(fare)
} catch (e: Exception) {
println("Error calculating fare: $e") // Silently swallowed
}
}
}
// AFTER (correct)
fun calculateFare() {
viewModelScope.launch {
_uiState.value = RideUiState.Loading
fareRepository.calculateFare(rideId) // now returns Result<Fare> (e.g. wrapped with runCatching in the repository)
.onSuccess { fare ->
_uiState.value = RideUiState.FareCalculated(fare)
}
.onFailure { error ->
_uiState.value = RideUiState.Error(
message = error.toUserFacingMessage(),
retryAction = ::calculateFare
)
Timber.e(error, "Fare calculation failed for ride $rideId")
}
}
}
The RideUiState.Error state causes the Compose UI to display an error message with a "Retry" button. The Timber.e() call ensures the error appears in production monitoring (Firebase Crashlytics via Timber tree). Making the error state include a retryAction lambda (the ::calculateFare function reference) means the Compose UI can call viewModel.retryAction() from the retry button without knowing which specific operation failed.
Combining All Four Streams into One UiState
With all four streams properly lifecycle-aware, combine them in the ViewModel:
val uiState: StateFlow<RideUiState> = combine(
    driverLocation,
    nearbyDriversFlow,
    chatMessagesFlow,
    etaCountdown
) { location, drivers, messages, eta ->
    RideUiState.Active(
        driverLocation = location,
        nearbyDrivers = drivers,
        chatMessages = messages,
        etaSeconds = eta
    )
}.stateIn(
    scope = viewModelScope,
    started = SharingStarted.WhileSubscribed(5_000),
    initialValue = RideUiState.Loading
)
The Compose UI collects a single uiState with collectAsStateWithLifecycle() — one collector, four data streams, zero memory leaks.
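On the Compose side, the single-collector pattern described above might look like this — the screen and child composable names are illustrative:

```kotlin
@Composable
fun RideScreen(viewModel: RideViewModel = hiltViewModel()) {
    // One lifecycle-aware collector for all four merged streams
    val uiState by viewModel.uiState.collectAsStateWithLifecycle()
    when (val state = uiState) {
        is RideUiState.Loading -> LoadingIndicator()          // hypothetical composables
        is RideUiState.Active -> RideMap(state)
        is RideUiState.Error -> ErrorBanner(state.message, onRetry = state.retryAction)
    }
}
```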
Early Warning Metrics:
- LeakCanary reports in the debug build — any coroutine-related leak appears as a CoroutineContext or Job object in the LeakCanary leak trace; a weekly review of LeakCanary reports in the CI pipeline catches coroutine leaks before they reach production
- Background location access log (Firebase Performance custom trace) — trace the duration between the app going to background and location updates stopping; with the GlobalScope fix, this should be under 100ms (the time for viewModelScope.onCleared() to propagate); above 5 seconds indicates a lingering coroutine
- SharingStarted.WhileSubscribed reconnection rate — the number of times per session that the upstream flows reconnect (indicating the user backgrounded the app and returned); a reconnection rate above 3 per session suggests the 5-second timeout is too short and the upstream flows are restarting on normal navigation flows
4. Interview Score: 9.5 / 10
Why this demonstrates senior-level maturity: The SharingStarted.WhileSubscribed(5_000) configuration — specifically the 5-second timeout that keeps the upstream active during rotation (which takes under 1 second) without leaving it active indefinitely in the background — is the production-calibrated detail that distinguishes engineers who have debugged coroutine lifecycle issues in production from those who know the API documentation. The RideUiState.Error with a retryAction = ::calculateFare lambda — designing the error state to include the retry action so the UI can retry without coupling to the ViewModel's method name — is the API design sophistication that makes the error handling pattern reusable across all error states. The repeatOnLifecycle vs. launchWhenStarted distinction with the specific explanation (launchWhenStarted suspends collection but keeps the upstream running and buffering) is the precise correctness argument.
What differentiates it from mid-level thinking: A mid-level Android developer would replace GlobalScope with viewModelScope (correct) and replace launchWhenStarted with repeatOnLifecycle (correct) but would not know about SharingStarted.WhileSubscribed(5_000) for rotation-resilient flows, would not know the combine operator for merging four streams into one state, and would not design the error state with a retry lambda. They would likely replace the Handler timer with a Timer (also leaks) rather than a coroutine ticker flow.
What would make it a 10/10: A 10/10 response would include a complete unit test for the combine-based uiState using kotlinx-coroutines-test's TestScope and MutableSharedFlow test doubles for each of the four input streams, showing how to advance virtual time and assert intermediate states during the combination, plus a turbine library example for the etaCountdown Flow test.
Question 7: Background Processing — WorkManager, Foreground Services, and Exact Alarms
Difficulty: Senior | Role: Android Developer | Level: Senior | Company Examples: Google, WhatsApp, Spotify, Todoist, Duolingo
The Question
You are a Senior Android Developer at a productivity app company. The app has three background processing requirements: (1) A document sync feature that must upload any locally edited documents to the cloud within 15 minutes of the edit, but only when the device has a network connection — it must survive app restarts and reboots, handle sync failures with exponential backoff, and chain a compression step before the upload; (2) A podcast download feature that must show a persistent notification with download progress while downloading, allow the user to pause/resume/cancel the download, and download multiple podcasts simultaneously; (3) A "smart reminder" feature that sends a notification at an exact user-specified time (e.g., "Remind me at 9:00am Tuesday") — the user's trust in the app depends on the reminder arriving at exactly the specified time, even if the app is not running. Walk through the correct background processing mechanism for each requirement, the specific APIs involved, and the Android version constraints that affect your implementation choices.
1. What Is This Question Testing?
- WorkManager for deferrable background work — understanding that WorkManager is the correct API for deferrable, guaranteed background work (document sync); knowing the WorkRequest types: OneTimeWorkRequest (run once, with optional retry policy) and PeriodicWorkRequest (run on a schedule with a minimum period of 15 minutes); knowing the Constraints API: Constraints.Builder().setRequiredNetworkType(NetworkType.CONNECTED).build(); and knowing that WorkManager persists work to a Room database, ensuring it survives process death and device reboots
- ListenableWorker chaining — knowing that WorkManager's WorkContinuation chains multiple WorkRequests sequentially (the output data from the first Worker is passed as input to the second via the Data object); knowing that .then(compressionWork).then(uploadWork) creates a serial chain; and knowing that .enqueue(listOf(compressionWork, uploadWork)) without .then() runs them in parallel
- Foreground services for long-running user-visible work — knowing that Android 8.0+ requires long-running background work to run as a foreground service with a persistent notification; knowing that for file downloads, WorkManager's setForeground(ForegroundInfo) method within a CoroutineWorker is the recommended approach (it creates a foreground service managed by WorkManager rather than requiring the app to manage the service lifecycle manually); knowing the DownloadManager system service as an alternative for simple downloads that do not require custom progress UI
- Exact alarms and the AlarmManager API — knowing the three alarm types: setExact() (exact delivery while the device is awake — deferred during Doze), setExactAndAllowWhileIdle() (exact delivery even during Doze — required for alarms that must fire regardless of battery saver), and setAlarmClock() (the highest-priority alarm that appears in the device's system clock UI and wakes the device from Doze and Standby); knowing that Android 12 (API 31) introduced the SCHEDULE_EXACT_ALARM permission that users must grant explicitly; knowing that Android 13 (API 33) added the USE_EXACT_ALARM permission for calendar/clock apps that does not require user approval but has stricter Play Store policy requirements
- BroadcastReceiver for alarm delivery — knowing that AlarmManager delivers the alarm by firing a BroadcastReceiver; knowing that the BroadcastReceiver.onReceive() method runs on the main thread with a roughly 10-second execution limit; any work that takes longer (like showing a rich notification with a database lookup) must be handed off to a coroutine with goAsync() or a WorkRequest
- Doze mode and battery optimisation impact — knowing that Android's Doze mode (introduced in Android 6.0) defers network access and JobScheduler/WorkManager jobs during idle periods; knowing that setExactAndAllowWhileIdle() and setAlarmClock() are the AlarmManager methods guaranteed to fire during Doze; and knowing that Android 14+ denies the SCHEDULE_EXACT_ALARM permission by default for most newly installed apps for battery efficiency reasons — apps must justify the exact alarm requirement in Play Store metadata
2. Framework: Background Processing Selection Model (BPSM)
- Assumption Documentation — For each background requirement, determine: can it tolerate delay? (WorkManager is appropriate for deferrable work with flexible timing), does it require user visibility while running? (foreground service via WorkManager's setForeground), does it require exact timing? (AlarmManager with the appropriate permission), and does it require network access? (WorkManager's NetworkType constraint)
- Constraint Analysis — Android 12+ requires the SCHEDULE_EXACT_ALARM permission for exact alarms; if the target SDK is 31+, the app must handle the case where the permission is not granted (prompt the user to grant it in system settings after checking AlarmManager.canScheduleExactAlarms()); Android 14+ changes the foreground service type requirements (declare foregroundServiceType in the manifest)
- Tradeoff Evaluation — For podcast downloads: WorkManager with setForeground (recommended — WorkManager manages the foreground service lifecycle) vs. a manually managed Service (more control, more complexity, more ways to introduce bugs); for most Android apps, WorkManager's managed foreground service is the correct choice; a manually managed Service is only necessary when WorkManager's scheduling model does not fit the use case
- Hidden Cost Identification — WorkManager's minimum periodic work interval is 15 minutes (enforced by the OS — a shorter interval is silently clamped to 15 minutes); for the document sync requirement (sync within 15 minutes of edit), use a OneTimeWorkRequest triggered by the document edit event, not a PeriodicWorkRequest
- Risk Signals / Early Warning Metrics — WorkManager task success/failure rate (WorkManager exposes WorkInfo.State — monitor the ratio of SUCCEEDED to FAILED states via the WorkManager LiveData API or Firebase Analytics custom events), exact alarm delivery accuracy in production (measure the delta between the user's requested reminder time and the actual notification delivery time via Firebase Performance custom traces; target under 30 seconds on modern Android versions)
- Pivot Triggers — If the exact alarm permission grant rate is below 40% (users are not granting SCHEDULE_EXACT_ALARM): implement a fallback using inexact alarms (setAndAllowWhileIdle() with a ±5 minute window) with clear in-app communication to users that exact reminders require the permission; a reminder at ±5 minutes is better than no reminder
- Long-Term Evolution Plan — Test background work on a range of device manufacturers (Xiaomi, OnePlus, Samsung — all have custom battery optimisation that aggressively kills background processes beyond Android's standard Doze mode); use adb shell dumpsys deviceidle and manufacturer-specific testing tools to confirm WorkManager tasks fire correctly on restricted devices
3. The Answer
Requirement 1: Document Sync — WorkManager with Chained Workers and Constraints
WorkManager is the correct API: the sync must survive app restarts (WorkManager persists to Room), requires network (Constraints), must retry on failure (exponential backoff), and is deferrable (a 15-minute window is acceptable). Chained workers for compress-then-upload:
kotlin
// Document edit trigger
fun onDocumentEdited(documentId: String) {
val compressionWork = OneTimeWorkRequestBuilder<DocumentCompressionWorker>()
.setInputData(workDataOf(KEY_DOCUMENT_ID to documentId))
.setConstraints(Constraints.Builder().build()) // No network needed for compression
.setBackoffCriteria(BackoffPolicy.EXPONENTIAL, 30, TimeUnit.SECONDS)
.build()
val uploadWork = OneTimeWorkRequestBuilder<DocumentUploadWorker>()
.setConstraints(
Constraints.Builder()
.setRequiredNetworkType(NetworkType.CONNECTED)
.build()
)
.setBackoffCriteria(BackoffPolicy.EXPONENTIAL, 60, TimeUnit.SECONDS)
.build()
WorkManager.getInstance(context)
.beginUniqueWork(
"sync_$documentId", // Unique name prevents duplicate sync jobs for the same document
ExistingWorkPolicy.REPLACE, // If already queued, replace with fresh job
compressionWork
)
.then(uploadWork) // Upload only runs after compression succeeds
.enqueue()
}
The DocumentCompressionWorker passes the compressed file path to the DocumentUploadWorker via Result.success(workDataOf(KEY_COMPRESSED_PATH to compressedPath)) — WorkManager's chaining automatically passes the output data as input to the next worker. The beginUniqueWork with ExistingWorkPolicy.REPLACE ensures that if the user edits the same document multiple times rapidly, only one sync job runs (the latest one).
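The worker side of that chain can be sketched as follows — the worker bodies, the compressDocument/uploadFile helpers, and any keys beyond those named above are assumptions:

```kotlin
class DocumentCompressionWorker(
    context: Context,
    params: WorkerParameters
) : CoroutineWorker(context, params) {
    override suspend fun doWork(): Result {
        val documentId = inputData.getString(KEY_DOCUMENT_ID) ?: return Result.failure()
        val compressedPath = compressDocument(documentId) // hypothetical helper
        // The output Data becomes the next worker's inputData via the chain
        return Result.success(workDataOf(KEY_COMPRESSED_PATH to compressedPath))
    }
}

class DocumentUploadWorker(
    context: Context,
    params: WorkerParameters
) : CoroutineWorker(context, params) {
    override suspend fun doWork(): Result {
        val path = inputData.getString(KEY_COMPRESSED_PATH) ?: return Result.failure()
        return try {
            uploadFile(path) // hypothetical network call
            Result.success()
        } catch (e: IOException) {
            Result.retry() // Retried with the exponential backoff set in setBackoffCriteria
        }
    }
}
```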
Requirement 2: Podcast Downloads — WorkManager with setForeground
For downloads showing persistent progress notifications: use CoroutineWorker with setForeground. This avoids manually managing a Service lifecycle while still showing a persistent notification:
kotlin
class PodcastDownloadWorker(
    context: Context,
    workerParams: WorkerParameters
) : CoroutineWorker(context, workerParams) {

    override suspend fun doWork(): Result {
        val podcastId = inputData.getString(KEY_PODCAST_ID) ?: return Result.failure()
        // Show foreground notification (creates foreground service automatically)
        setForeground(createForegroundInfo(podcastId, progress = 0))
        return try {
            podcastRepository.downloadPodcast(podcastId) { progress ->
                // setForeground is suspending — use setForegroundAsync from a
                // non-suspending progress callback
                setForegroundAsync(createForegroundInfo(podcastId, progress))
            }
            Result.success()
        } catch (e: CancellationException) {
            // Worker was cancelled (user paused/cancelled) — clean up the partial
            // download, then rethrow so cancellation propagates correctly
            podcastRepository.deletePartialDownload(podcastId)
            throw e
        } catch (e: IOException) {
            Result.retry() // Retry on network error
        }
    }

    private fun createForegroundInfo(podcastId: String, progress: Int): ForegroundInfo {
        val notification = NotificationCompat.Builder(applicationContext, DOWNLOAD_CHANNEL_ID)
            .setContentTitle("Downloading podcast")
            .setProgress(100, progress, false)
            .addAction(R.drawable.ic_pause, "Pause", createPauseIntent(podcastId))
            .addAction(R.drawable.ic_cancel, "Cancel", createCancelIntent(id)) // id = WorkManager work ID
            .setSmallIcon(R.drawable.ic_download)
            .setOngoing(true)
            .build()
        return ForegroundInfo(podcastId.hashCode(), notification)
    }
}
Pause/resume: cancelling the WorkRequest (via WorkManager.cancelWorkById(workId)) cancels the coroutine, triggering the CancellationException handler. Resume is implemented by re-enqueueing the download work with ExistingWorkPolicy.KEEP — if the download is already in progress (not paused), the re-enqueue does nothing. Concurrent downloads: enqueue each podcast download as a separate OneTimeWorkRequest — WorkManager parallelises them automatically within its thread pool.
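The enqueue/cancel side of that lifecycle can be sketched with unique work names instead of raw work IDs — the helper names and the "download_" naming scheme are assumptions:

```kotlin
fun startDownload(context: Context, podcastId: String) {
    val request = OneTimeWorkRequestBuilder<PodcastDownloadWorker>()
        .setInputData(workDataOf(KEY_PODCAST_ID to podcastId))
        .build()
    // KEEP: if a download for this podcast is already running, the re-enqueue is a no-op
    WorkManager.getInstance(context)
        .enqueueUniqueWork("download_$podcastId", ExistingWorkPolicy.KEEP, request)
}

fun pauseDownload(context: Context, podcastId: String) {
    // Cancels the worker's coroutine, triggering its CancellationException path
    WorkManager.getInstance(context).cancelUniqueWork("download_$podcastId")
}
```

Each podcast gets its own unique work name, so concurrent downloads are simply multiple independent requests that WorkManager parallelises.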
Requirement 3: Exact Reminders — AlarmManager with Permission Handling
For reminders that must fire at an exact user-specified time, WorkManager is insufficient (it does not guarantee exact timing). Use AlarmManager.setAlarmClock() — the highest-priority alarm type that appears in the system clock UI and wakes the device from Doze:
kotlin
fun scheduleReminder(reminderId: String, triggerAtMillis: Long, reminderText: String) {
    val alarmManager = context.getSystemService(AlarmManager::class.java)
    // Android 12+: check if exact alarms are permitted
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.S) {
        if (!alarmManager.canScheduleExactAlarms()) {
            // Navigate the user to system settings to grant SCHEDULE_EXACT_ALARM
            val intent = Intent(Settings.ACTION_REQUEST_SCHEDULE_EXACT_ALARM)
            context.startActivity(intent)
            return
        }
    }
    val intent = Intent(context, ReminderBroadcastReceiver::class.java).apply {
        action = ACTION_REMINDER_FIRE
        putExtra(KEY_REMINDER_ID, reminderId)
        putExtra(KEY_REMINDER_TEXT, reminderText)
    }
    val pendingIntent = PendingIntent.getBroadcast(
        context,
        reminderId.hashCode(),
        intent,
        PendingIntent.FLAG_UPDATE_CURRENT or PendingIntent.FLAG_IMMUTABLE
    )
    // setAlarmClock fires during Doze, appears in system clock UI
    alarmManager.setAlarmClock(
        AlarmManager.AlarmClockInfo(triggerAtMillis, pendingIntent),
        pendingIntent
    )
}
The ReminderBroadcastReceiver: the receiver's onReceive() runs on the main thread with a 10-second time limit; for looking up reminder details from Room (which is a database query), use goAsync():
kotlin
class ReminderBroadcastReceiver : BroadcastReceiver() {
    override fun onReceive(context: Context, intent: Intent) {
        val pendingResult = goAsync() // Keeps the broadcast (and process) alive while the coroutine runs
        CoroutineScope(Dispatchers.IO).launch {
            try {
                val reminderId = intent.getStringExtra(KEY_REMINDER_ID) ?: return@launch
                val reminder = reminderRepository.getReminder(reminderId)
                NotificationHelper.showReminderNotification(context, reminder)
            } finally {
                pendingResult.finish() // Must be called to signal that onReceive is complete
            }
        }
    }
}
Rescheduling after reboot: alarms are cleared when the device reboots; register a BOOT_COMPLETED BroadcastReceiver that queries all pending reminders from Room and re-schedules them with setAlarmClock(). Declare in AndroidManifest.xml:
xml
<uses-permission android:name="android.permission.RECEIVE_BOOT_COMPLETED" />

<receiver android:name=".BootCompletedReceiver" android:exported="true">
    <intent-filter>
        <action android:name="android.intent.action.BOOT_COMPLETED" />
    </intent-filter>
</receiver>
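The BootCompletedReceiver itself follows the same goAsync() pattern as the reminder receiver — the repository access and reminder fields here are assumptions:

```kotlin
class BootCompletedReceiver : BroadcastReceiver() {
    override fun onReceive(context: Context, intent: Intent) {
        if (intent.action != Intent.ACTION_BOOT_COMPLETED) return
        val pendingResult = goAsync() // The Room query must not run on the main thread
        CoroutineScope(Dispatchers.IO).launch {
            try {
                // Re-schedule every reminder that is still in the future
                reminderRepository.getPendingReminders()
                    .filter { it.triggerAtMillis > System.currentTimeMillis() }
                    .forEach { scheduleReminder(it.id, it.triggerAtMillis, it.text) }
            } finally {
                pendingResult.finish()
            }
        }
    }
}
```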
Early Warning Metrics:
- WorkManager task failure rate per Worker class — monitor the WorkInfo.State.FAILED count per Worker subclass in Firebase Analytics; a failure rate above 5% for the upload worker indicates a server-side issue or network error handling bug; a failure rate above 1% for the compression worker indicates a disk space or I/O error
- Exact alarm permission grant rate — log the alarmManager.canScheduleExactAlarms() result in Firebase Analytics segmented by Android API level and device manufacturer; a grant rate below 60% for API 31+ devices indicates the permission request UX is not compelling enough; below 30% is a product-level crisis for a reminder app
- Alarm delivery delta in production — log System.currentTimeMillis() - triggerAtMillis in BroadcastReceiver.onReceive() and send it to Firebase Performance as a custom trace; the median delivery delta should be under 5 seconds; P95 above 60 seconds indicates Doze mode interference and requires upgrading from setExact() to setAlarmClock()
4. Interview Score: 9.5 / 10
Why this demonstrates senior-level maturity: The beginUniqueWork with ExistingWorkPolicy.REPLACE for the document sync — preventing duplicate sync jobs for the same document when the user edits rapidly — is the production correctness detail that prevents the "5 queued syncs for the same document" bug that appears after the first implementation. The goAsync() pattern in the BroadcastReceiver with the try/finally to call pendingResult.finish() is the correct pattern for performing async work in a receiver without causing an ANR — and knowing that finally is required (not just the happy path) shows understanding of the 10-second execution limit. The distinction between setExact(), setExactAndAllowWhileIdle(), and setAlarmClock() — with the specific explanation that setAlarmClock() appears in the system clock UI — is the alarm API depth that determines whether a reminder app is reliable or not.
What differentiates it from mid-level thinking: A mid-level Android developer would use setExact() for the reminder (insufficient during Doze mode, reminders will be delayed), would not know about goAsync() for the BroadcastReceiver, would not chain WorkManager workers (beginWith().then()) and would instead put both compression and upload in a single Worker, and would not know about BOOT_COMPLETED for rescheduling alarms after reboot. They would also not know about the SCHEDULE_EXACT_ALARM permission introduced in Android 12.
What would make it a 10/10: A 10/10 response would include a WorkManager integration test using TestDriver (the WorkManager test API) to verify that the compress-then-upload chain executes correctly and that the unique work policy prevents duplicates, a full AndroidManifest.xml configuration for the foreground service with foregroundServiceType="dataSync" (required for Android 14+ foreground service type declarations), and a ManufacturerBatteryOptimizationHelper utility showing how to detect Xiaomi/OnePlus battery optimisation and direct the user to the correct system settings screen.
Question 8: CI/CD and Build Configuration — Gradle, Build Variants, and Automated Deployment
Difficulty: Senior | Role: Android Developer | Level: Senior | Company Examples: Google, Spotify, Shopify, Duolingo, Revolut
The Question
You are a Senior Android Developer responsible for the build and release engineering at a 15-engineer Android team. The current state: build configuration is in Groovy DSL (not Kotlin DSL), all build-time configuration (API keys, feature flags, base URLs) is hardcoded in the BuildConfig class and committed to the repository, the CI pipeline runs the full 35-minute test suite on every commit to every branch regardless of what changed, there is no automated Play Store deployment (releases are manually uploaded by one engineer), and the app has no product flavours — the same binary is submitted for both the free and the premium tier. You have been asked to modernise the build system. Walk through the migration to Kotlin DSL, the secure handling of build-time secrets, the CI/CD pipeline optimisation strategy, the product flavour setup for the free/premium split, and the automated Play Store deployment via Fastlane or the Play Developer API.
1. What Is This Question Testing?
- Kotlin DSL for Gradle — knowing the differences between Groovy DSL (build.gradle) and Kotlin DSL (build.gradle.kts): Kotlin DSL provides IDE autocompletion for build configuration, compile-time type checking (configuration errors are caught before the build runs, not during), and is the direction Android Studio and Gradle are moving; knowing the migration approach (rename files, convert closures to lambdas, convert def to val, convert string property access to typed property access); and knowing the buildSrc or convention-plugins approach for sharing build logic across modules
- Secure build-time secret management — understanding that API keys and secrets in BuildConfig fields committed to the repository violate the principle of least privilege; knowing the correct approach: read secrets from environment variables in build.gradle.kts (System.getenv("API_KEY") ?: "") which are injected by the CI system (GitHub Actions secrets, Jenkins credentials); locally, developers use a local.properties file (gitignored) that stores the secrets for development builds; the local.properties pattern is also used for release signing credentials
- Build variants and product flavours — knowing the distinction: buildTypes (debug/release — affect the build's debugging configuration, signing, ProGuard) vs. productFlavors (free/premium — affect what code and resources are included in the build); knowing that combining a flavour with a build type creates a build variant (e.g., freeDebug, freeRelease, premiumDebug, premiumRelease); knowing how to use flavour-specific source sets (src/free/java/, src/premium/java/) to include different implementations per flavour
- CI pipeline optimisation — knowing the strategies for reducing CI time: path-based test triggering (only run tests for modules that changed), parallelising independent test modules across CI agents, caching the Gradle build cache and dependency downloads between runs, and using emulator snapshots to reduce emulator boot time; knowing that a 35-minute full test suite on every commit to every branch is wasteful — a tiered approach (fast unit tests on every commit, full integration and E2E tests on merge to main) reduces feedback latency
- Automated Play Store deployment — knowing the two approaches: Fastlane's supply action (an open-source Ruby gem that wraps the Google Play Developer API, commonly used for metadata + binary upload), and the Google Play Developer API directly (REST API, requires service account credentials); knowing the deployment tracks: internal (immediate distribution to internal testers), alpha (opt-in alpha testing), beta (opt-in beta), and production (full rollout with optional percentage rollout); knowing the versionCode auto-increment requirement (the Play Store rejects uploads with a versionCode that is not greater than the current production version)
- Version code management — knowing the common automated versionCode strategies: using the CI build number (System.getenv("CI_BUILD_NUMBER")?.toInt() ?: 1 in build.gradle.kts), using a timestamp, or using git rev-list --count HEAD (the number of commits since the repo was created); knowing that the versionCode must be a monotonically increasing integer and that Git commit count is the most reproducible strategy
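The Git-commit-count strategy with the CI build number taking precedence could be sketched in build.gradle.kts like this — shelling out via ProcessBuilder is one of several ways to run git from a build script, and the exact wiring is an assumption:

```kotlin
// Commit count is monotonically increasing as long as history is append-only
fun gitCommitCount(): Int =
    ProcessBuilder("git", "rev-list", "--count", "HEAD")
        .start().inputStream.bufferedReader().readText().trim().toInt()

android {
    defaultConfig {
        // Prefer the CI build number when present; fall back to commit count locally
        versionCode = System.getenv("CI_BUILD_NUMBER")?.toIntOrNull() ?: gitCommitCount()
    }
}
```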
2. Framework: Build and Release Engineering Modernisation Model (BREMM)
- Assumption Documentation — Count the number of build.gradle files in the project (one per module) to understand the scope of the Kotlin DSL migration; identify which modules share build configuration logic (candidates for convention plugins); confirm the CI platform (GitHub Actions, GitLab CI, Bitrise, CircleCI), as the secret injection mechanism differs per platform
- Constraint Analysis — The Kotlin DSL migration can be done module-by-module without breaking the build; the secret migration must be coordinated with the CI team to ensure environment variables are set before the secrets are removed from the codebase; the automated deployment must be tested on the internal track before being enabled for production
- Tradeoff Evaluation — buildSrc vs. convention plugins (Gradle composite build) for shared build logic: buildSrc is simpler (a single directory at the project root containing Kotlin files that compile to Gradle plugins) but invalidates the entire project's build cache when changed; convention plugins using includeBuild("build-logic") in settings.gradle.kts are more complex but allow incremental caching; for a 15-module project, convention plugins are worth the additional complexity
- Hidden Cost Identification — The versionCode auto-increment from Git commit count has a gotcha: if the repository is re-created or the initial commit is amended, the commit count changes, potentially producing a lower versionCode than the current production version; use the CI build number as a secondary safety mechanism
- Risk Signals / Early Warning Metrics — CI build success rate (a Kotlin DSL migration that introduces a syntax error in a build file fails silently until the next CI run — the migration must be done with local build verification at each step), Play Store upload rejection rate (the Play Store API returns specific rejection codes; a VERSION_CODE_NOT_INCREASING error in CI indicates the versionCode strategy is failing), build time trend (should decrease after CI optimisations — alert if any optimisation increases the median build time)
- Pivot Triggers — If path-based test triggering causes tests to be skipped for a shared :core module change that should trigger all dependent module tests: the path change detection is too granular; fall back to "any change to a :core module triggers the full test suite" until the dependency graph analysis is refined
- Long-Term Evolution Plan — Quarter 1: Kotlin DSL migration + secret management; Quarter 2: CI pipeline optimisation + product flavours; Quarter 3: automated Play Store deployment + release signing in CI; Quarter 4: Gradle build scan integration (Develocity) for build performance analytics
3. The Answer
Step 1: Kotlin DSL Migration
Migrate module by module, starting with the :app module (the most critical module — validating the approach there first proves it works before touching the rest). The migration is mostly mechanical: rename build.gradle to build.gradle.kts, convert Groovy closures to Kotlin lambdas, and convert string-based method calls to typed Kotlin property assignments. Key conversions:
kotlin
// BEFORE (Groovy)
android {
compileSdkVersion 34
defaultConfig {
applicationId "com.example.app"
minSdkVersion 24
targetSdkVersion 34
versionCode 1
versionName "1.0"
}
}
// AFTER (Kotlin DSL)
android {
compileSdk = 34
defaultConfig {
applicationId = "com.example.app"
minSdk = 24
targetSdk = 34
versionCode = System.getenv("CI_BUILD_NUMBER")?.toInt() ?: 1
versionName = "1.0"
}
}
Shared build logic with a convention plugin: create build-logic/convention/src/main/kotlin/AndroidLibraryConventionPlugin.kt:
kotlin
class AndroidLibraryConventionPlugin : Plugin<Project> {
    override fun apply(target: Project) {
        with(target) {
            with(pluginManager) {
                apply("com.android.library")
                apply("org.jetbrains.kotlin.android")
            }
            extensions.configure<LibraryExtension> {
                compileSdk = 34
                defaultConfig {
                    minSdk = 24
                    testInstrumentationRunner = "androidx.test.runner.AndroidJUnitRunner"
                }
            }
        }
    }
}
Every library module then uses id("convention.android-library") in its plugins {} block instead of repeating the 20-line Android library configuration. A change to the convention plugin triggers rebuilds only for modules that use it (if build-logic is an included build — not buildSrc).
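For the id("convention.android-library") reference to resolve, the build-logic included build has to register the plugin class under that id. A sketch of build-logic/convention/build.gradle.kts, assuming the plugin id and class name used above:

```kotlin
plugins {
    `kotlin-dsl`
}

gradlePlugin {
    plugins {
        register("androidLibrary") {
            // The id that library modules reference in their plugins {} block
            id = "convention.android-library"
            implementationClass = "AndroidLibraryConventionPlugin"
        }
    }
}
```

The root settings.gradle.kts must also contain includeBuild("build-logic") so the convention plugins participate as a composite build rather than buildSrc.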
Step 2: Secure Secret Management
Remove all hardcoded API keys from BuildConfig. The correct pattern uses environment variables in CI and local.properties for local development:
kotlin
// In build.gradle.kts for :app
android {
    defaultConfig {
        // Read from environment variable (CI) or local.properties (local dev)
        val apiKey = System.getenv("API_KEY")
            ?: project.rootProject.file("local.properties")
                .takeIf { it.exists() }
                ?.let { java.util.Properties().apply { load(it.inputStream()) } }
                ?.getProperty("API_KEY")
            ?: throw GradleException("API_KEY not set. Add to local.properties or CI environment.")
        buildConfigField("String", "API_KEY", "\"$apiKey\"")
    }
}
The local.properties file pattern: create local.properties at the project root (add it to .gitignore), add API_KEY=dev_key_value_here. Each developer sets up their own local.properties after cloning the repo. In CI (GitHub Actions), set API_KEY as a repository secret and inject it as an environment variable in the workflow YAML: env: API_KEY: ${{ secrets.API_KEY }}.
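The CI half of that pattern, as a minimal GitHub Actions workflow fragment — the workflow and job names are illustrative:

```yaml
# .github/workflows/ci.yml (fragment)
jobs:
  build:
    runs-on: ubuntu-latest
    env:
      API_KEY: ${{ secrets.API_KEY }}  # Injected from repository secrets; read by build.gradle.kts
    steps:
      - uses: actions/checkout@v4
      - run: ./gradlew assembleDebug
```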
Step 3: Product Flavours for Free vs. Premium
kotlin
android {
    flavorDimensions += "tier"
    productFlavors {
        create("free") {
            dimension = "tier"
            applicationIdSuffix = ".free"
            versionNameSuffix = "-free"
            buildConfigField("Boolean", "IS_PREMIUM", "false")
        }
        create("premium") {
            dimension = "tier"
            applicationIdSuffix = "" // Production app ID — no suffix
            buildConfigField("Boolean", "IS_PREMIUM", "true")
        }
    }
}
Flavour-specific source sets: create src/free/kotlin/ and src/premium/kotlin/ for code that differs between flavours. The FeatureGating.kt class can have two implementations:
src/free/kotlin/com/example/FeatureGating.kt → returns false for all premium features
src/premium/kotlin/com/example/FeatureGating.kt → returns true for all premium features
Gradle automatically includes the correct implementation based on the active flavour — no if (BuildConfig.IS_PREMIUM) checks in shared code required. This is cleaner than the BuildConfig.IS_PREMIUM boolean flag because the unused implementation is never compiled into the variant at all.
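A sketch of the two flavour-specific files — the specific feature checks are illustrative; only one of the two files is compiled into any given variant:

```kotlin
// src/free/kotlin/com/example/FeatureGating.kt
object FeatureGating {
    fun isOfflineModeEnabled(): Boolean = false // Free tier: all premium features off
}

// src/premium/kotlin/com/example/FeatureGating.kt — same package, same signatures
object FeatureGating {
    fun isOfflineModeEnabled(): Boolean = true // Premium tier: features on
}
```

Shared code simply calls FeatureGating.isOfflineModeEnabled() and the active flavour's source set supplies the implementation.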
Step 4: CI Pipeline Optimisation
Tier the CI pipeline to provide fast feedback on every commit and full confidence on merge. Configure GitHub Actions with 3 jobs:
- Job 1 — Unit tests (every commit, target under 4 minutes): runs ./gradlew testDebugUnitTest with only the changed modules detected via a path filter
- Job 2 — Integration tests + build (every PR merge to the develop branch, target under 12 minutes): runs on a GitHub Actions emulator with ./gradlew connectedDebugAndroidTest parallelised across 2 emulator instances using the Android Emulator Runner action
- Job 3 — Release build + E2E tests + Play Store upload (every merge to main, target under 25 minutes): full release build for both flavours, E2E smoke tests on a Firebase Test Lab device matrix, then automatic upload to the internal Play Store track
Gradle build cache: add to gradle.properties:
properties
org.gradle.caching=true
org.gradle.configuration-cache=true
org.gradle.parallel=true
# Disable the Gradle daemon in CI — saves memory (properties comments must be on their own line)
org.gradle.daemon=false
Step 5: Automated Play Store Deployment via Fastlane
ruby
# fastlane/Fastfile
lane :deploy_internal do
gradle(
task: "bundle",
flavor: "premium",
build_type: "Release"
)
upload_to_play_store(
track: "internal",
aab: "app/build/outputs/bundle/premiumRelease/app-premium-release.aab",
json_key_data: ENV["PLAY_STORE_JSON_KEY"], # Service account JSON from CI secrets
skip_upload_metadata: false,
metadata_path: "fastlane/metadata/android"
)
end
lane :promote_to_production do |options|
upload_to_play_store(
track: "internal",
track_promote_to: "production",
rollout: options[:rollout] || "0.1" # Default to 10% staged rollout
)
end
The PLAY_STORE_JSON_KEY is a GitHub Actions secret containing the service account JSON key from the Google Play Console API credentials. The deploy_internal lane runs automatically on every merge to main; the promote_to_production lane is triggered manually (after QA validation of the internal build) via a GitHub Actions workflow dispatch event.
Early Warning Metrics:
- CI build success rate per job tier — a failure rate above 5% for the unit test tier indicates flaky tests or non-deterministic test data; address before the CI becomes distrusted by the team
- Play Store upload latency — the time from CI build completion to the release appearing in the internal track; the Play Store API processing typically takes 5–20 minutes; alert if the upload step itself takes over 5 minutes (indicating API rate limiting or credential issues)
- versionCode monotonicity check — add a CI step that calls the Play Store API to fetch the current production versionCode and asserts that the computed versionCode for the current build is greater; fail the CI build with a clear error message if not
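The monotonicity check itself reduces to one comparison. A hedged plain-Kotlin sketch of the CI-side assertion — fetching the live versionCode from the Play Developer API is stubbed out, and the function name is an assumption:

```kotlin
// Sketch of the monotonicity gate. In CI, productionVersionCode would come from the
// Play Developer API and candidateVersionCode from the current build's Gradle config.
fun checkVersionCodeMonotonic(productionVersionCode: Int, candidateVersionCode: Int) {
    require(candidateVersionCode > productionVersionCode) {
        "versionCode $candidateVersionCode must be greater than the live production " +
            "versionCode $productionVersionCode — bump versionCode before releasing"
    }
}

fun main() {
    checkVersionCodeMonotonic(productionVersionCode = 412, candidateVersionCode = 413) // passes silently
    runCatching { checkVersionCodeMonotonic(productionVersionCode = 412, candidateVersionCode = 412) }
        .onFailure { println("CI would fail: ${it.message}") }
}
```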
4. Interview Score: 9.5 / 10
Why this demonstrates senior-level maturity: The convention plugin approach (not buildSrc) for shared build logic — with the specific explanation that buildSrc invalidates the entire project's build cache on change while includeBuild convention plugins allow incremental caching — is the build engineering depth that matters for a 15-module project. The flavour-specific source sets (src/free/kotlin/FeatureGating.kt vs. src/premium/kotlin/FeatureGating.kt) for clean flavour separation without BuildConfig.IS_PREMIUM boolean checks in shared code shows architectural thinking about build variants. The Fastlane promote_to_production lane with a default 10% staged rollout (rollout: "0.1") — rather than an immediate 100% rollout — demonstrates the release engineering safety discipline that prevents a bad release from reaching all users simultaneously.
What differentiates it from mid-level thinking: A mid-level Android developer would know about product flavours and Fastlane but would not know about convention plugins vs. buildSrc and their caching implications, would use BuildConfig.IS_PREMIUM flags in shared code rather than flavour-specific source sets, would not know about the versionCode monotonicity check CI step, and would not design the three-tier CI pipeline with path-based triggering.
What would make it a 10/10: A 10/10 response would include the complete settings.gradle.kts with the includeBuild("build-logic") declaration, a GitHub Actions YAML workflow for the tiered CI pipeline showing the path filter configuration and the emulator parallelisation, and a release_signing.gradle.kts snippet showing how to read the keystore password from local.properties locally and from a CI secret in production without committing the keystore to the repository.
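A minimal sketch of the settings.gradle.kts composite-build wiring described above — the project and module names are illustrative, not taken from a real project:

```kotlin
// settings.gradle.kts — hypothetical sketch
pluginManagement {
    // Composite build: convention plugins live in build-logic/ and are cached
    // incrementally, unlike buildSrc, which invalidates the build cache on any change.
    includeBuild("build-logic")
    repositories {
        google()
        mavenCentral()
        gradlePluginPortal()
    }
}

rootProject.name = "news-app"
include(":app")
include(":core:network")
include(":core:design")
include(":feature:search")
```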
Question 9: Accessibility — Building an Inclusive Android App for Users with Disabilities
Difficulty: Senior | Role: Android Developer | Level: Senior | Company Examples: Google, Microsoft, BBC, Gov.uk apps, NHS apps
The Question
You are a Senior Android Developer at a news app company. An accessibility audit commissioned before a UK government contract renewal has found that the app fails multiple WCAG 2.1 AA criteria and cannot be used effectively with TalkBack, the Android screen reader used by visually impaired users. The audit findings: (1) The news article cards have no semantic content descriptions — TalkBack reads out "Image, Button, Button, Text" rather than meaningful content; (2) A custom circular progress indicator used for loading states is not accessible — TalkBack ignores it entirely and screen reader users do not know the app is loading; (3) The article reading view uses a custom ViewGroup that implements swipe-to-bookmark but intercepts all touch events, blocking TalkBack's touch exploration gestures; (4) The font size used throughout the app ignores the user's system font size preferences — users who have set large font sizes in Android accessibility settings see no change in the app. Walk through the specific code fixes for each finding and the accessibility practices the development team should adopt as permanent standards.
1. What Is This Question Testing?
- Content descriptions and semantic structure — knowing that android:contentDescription (View system) and Modifier.semantics { contentDescription = "..." } (Compose) provide the text that TalkBack reads aloud; knowing that complex compound views (a card containing an image, headline, byline, and two action buttons) should be grouped with Modifier.semantics(mergeDescendants = true) {} so TalkBack treats the entire card as one focusable element; knowing that a meaningful content description for a news card should describe its purpose and content (not just its type): "Breaking: UK election results 2024 article, by John Smith, published 2 hours ago, double-tap to read, swipe right for more actions"
- Custom view accessibility — knowing that the AccessibilityNodeInfoCompat API allows custom views to expose accessibility information to TalkBack; knowing that ViewCompat.setAccessibilityDelegate(view, object : AccessibilityDelegateCompat() { override fun onInitializeAccessibilityNodeInfo(host: View, info: AccessibilityNodeInfoCompat) { super.onInitializeAccessibilityNodeInfo(host, info); info.className = "android.widget.ProgressBar" } }) tells TalkBack that the custom view should be treated as a standard ProgressBar, which causes TalkBack to announce it as "Loading"
- Touch exploration and custom gestures — knowing that a custom ViewGroup that overrides onTouchEvent() and returns true for all touch events blocks TalkBack's touch exploration mode (which uses single-finger drag to explore the screen); knowing the fix: override dispatchHoverEvent() to pass hover events (which TalkBack uses for touch exploration) to an ExploreByTouchHelper while keeping the swipe touch gesture handling in onTouchEvent(); alternatively, expose the bookmark action via ViewCompat.addAccessibilityAction() as a custom TalkBack action ("Double-tap with two fingers to bookmark")
- Font scaling and sp units — knowing that Android text sizes should be specified in sp (scale-independent pixels) not dp (density-independent pixels); sp units automatically scale with the user's system font size preference (set in Android Settings → Accessibility → Display size and text); knowing that hardcoded pixel values (px) or dp values for text sizes prevent the system font scale from applying; and knowing the Compose equivalent: TextUnit with sp units via the sp extension (16.sp scales with system font size, 16.dp does not)
- Accessibility testing tools — knowing the tools available for accessibility testing: TalkBack on a physical device (the most comprehensive test), Accessibility Scanner (Google's automated accessibility scanner app that overlays accessibility issues on the screen), AccessibilityChecks.enable() (the Espresso accessibility checks that run automatically during UI tests — catching obvious issues in CI without a human running TalkBack), and the Compose accessibility testing APIs (composeTestRule.onNodeWithText("...").assertIsDisplayed())
- WCAG 2.1 AA criteria for mobile — knowing the key mobile-relevant WCAG criteria: 1.1.1 Non-text Content (images must have text alternatives), 1.3.1 Info and Relationships (semantic structure must be programmatically determinable), 1.4.4 Resize Text (text must be resizable up to 200% without loss of content), 2.1.1 Keyboard (all functionality must be operable without a pointer — in Android, "keyboard" means TalkBack's D-pad navigation), 2.4.3 Focus Order (TalkBack focus must move in a logical order that preserves meaning and operability)
2. Framework: Android Accessibility Remediation Model (AARM)
- Assumption Documentation — Establish the scope: is the app built with Views (XML + Fragment) or Jetpack Compose, or a mixture? The accessibility APIs differ between the two systems (though both ultimately write to the same AccessibilityNodeInfo tree); confirm which Android versions the app must support (TalkBack behaviour has changed significantly across Android versions — testing on Android 9, 12, and 14 covers the major TalkBack behaviour changes)
- Constraint Analysis — WCAG 2.1 AA compliance for the UK government contract is a legal requirement (UK Equality Act 2010, the Public Sector Bodies Accessibility Regulations 2018); the audit findings must be remediated within the contract renewal deadline; a secondary constraint is that accessibility fixes must not break the visual design for sighted users
- Tradeoff Evaluation — Fix 3 (touch event interception) has two approaches: the dispatchHoverEvent delegation approach (preserves the existing swipe gesture implementation, but is complex to implement correctly) vs. replacing the swipe gesture with explicit action buttons visible to sighted users (simpler, more discoverable for all users — the "curb-cut effect"); for a news app where swiping to bookmark is a differentiating UX feature, the delegation approach preserves the gesture while adding the TalkBack action
- Hidden Cost Identification — The content description fix for article cards requires understanding the current data model — the content description must include the article title, author, publication time, and whether the article is bookmarked; if this data is not available at the point where the card's contentDescription is set (because it is loaded asynchronously), the content description must be updated when the data arrives
- Risk Signals / Early Warning Metrics — Automated accessibility check failure rate in CI (using Espresso AccessibilityChecks.enable() — run on every PR; alert if any new check failures are introduced), TalkBack crash rate in Firebase Crashlytics (crashes that only occur in TalkBack mode are often caused by focus management bugs), and Play Store review monitoring for accessibility (search reviews monthly for terms such as "TalkBack," "accessibility," and "screen reader")
- Pivot Triggers — If the AccessibilityChecks CI gate is catching more than 20 new failures per sprint (indicating that the team is still introducing new accessibility issues faster than fixing old ones): run an accessibility sprint where the entire team uses TalkBack for one full workday — this creates immediate empathy for users with visual impairments and is more effective than any number of tooling requirements
- Long-Term Evolution Plan — Sprint 1–2: fix the 4 audit findings; Sprint 3–4: add AccessibilityChecks.enable() to the UI test suite; Sprint 5–6: conduct a full TalkBack walkthrough of all major user flows; Quarterly: repeat the TalkBack walkthrough and review automated accessibility reports
3. The Answer
Fix 1: News Article Card Content Descriptions (Compose)
The problem: a Compose ArticleCard Composable contains an Image, a headline Text, a byline Text, and two IconButtons (share and bookmark). TalkBack focuses on each element separately and reads meaningless descriptions ("Image," "Button," "Button"). The fix: use Modifier.semantics(mergeDescendants = true) to merge the card into one focusable element, and provide a meaningful merged content description:
kotlin
@Composable
fun ArticleCard(article: Article, onRead: () -> Unit, onShare: () -> Unit, onBookmark: () -> Unit) {
val bookmarkStatus = if (article.isBookmarked) "bookmarked" else "not bookmarked"
val timeAgo = article.publishedAt.toTimeAgoString() // "2 hours ago"
Card(
modifier = Modifier
.semantics(mergeDescendants = true) {
contentDescription = "${article.title}, by ${article.author}, $timeAgo, $bookmarkStatus"
onClick(label = "Read article") { onRead(); true }
}
.clickable(onClickLabel = "Read ${article.title}") { onRead() }
) {
// ... ArticleImage, HeadlineText, BylineText ...
// Share and bookmark buttons get custom action labels, not separate focus items
IconButton(onClick = onShare, modifier = Modifier.semantics {
contentDescription = "Share ${article.title}"
}) { /* share icon */ }
IconButton(onClick = onBookmark, modifier = Modifier.semantics {
contentDescription = if (article.isBookmarked) "Remove bookmark" else "Bookmark ${article.title}"
}) { /* bookmark icon */ }
}
}
With mergeDescendants = true, TalkBack treats the entire card as one focusable item. The merged contentDescription gives TalkBack a meaningful summary to announce. The share and bookmark buttons still have individual semantic labels for when the user navigates to them specifically via TalkBack's item-by-item navigation.
Fix 2: Custom Progress Indicator (View System)
The problem: a custom CircularProgressView extending View is invisible to TalkBack because it does not set any accessibility properties. Fix: set the accessibility delegate to announce the loading state:
kotlin
class CircularProgressView @JvmOverloads constructor(
context: Context, attrs: AttributeSet? = null
) : View(context, attrs) {
init {
ViewCompat.setAccessibilityDelegate(this, object : AccessibilityDelegateCompat() {
override fun onInitializeAccessibilityNodeInfo(
host: View,
info: AccessibilityNodeInfoCompat
) {
super.onInitializeAccessibilityNodeInfo(host, info)
info.className = ProgressBar::class.java.name // TalkBack recognises ProgressBar
info.contentDescription = context.getString(R.string.loading_content)
// "Loading, please wait" — announced by TalkBack when the view is focused
}
})
importantForAccessibility = IMPORTANT_FOR_ACCESSIBILITY_YES // Force TalkBack to include it
}
}
For the Compose equivalent, use Modifier.semantics { stateDescription = "Loading" } on the loading indicator Composable — stateDescription is the Compose semantic property that TalkBack announces as a state change.
Fix 3: Custom Swipe Gesture Blocking Touch Exploration
The problem: the article reading view's custom ArticleViewGroup handles swipe gestures to bookmark and archive articles. Its onTouchEvent() returns true for all touch events, blocking TalkBack's touch exploration (which relies on hover events, not touch events). The fix: separate touch events (for sighted users' swipe gestures) from hover events (TalkBack touch exploration):
kotlin
class ArticleViewGroup @JvmOverloads constructor(
context: Context, attrs: AttributeSet? = null
) : ViewGroup(context, attrs) {
private val accessibilityHelper = object : ExploreByTouchHelper(this) {
override fun getVirtualViewAt(x: Float, y: Float) = HOST_ID // The whole view is one virtual view
override fun getVisibleVirtualViews(virtualViewIds: MutableList<Int>) {
virtualViewIds.add(HOST_ID)
}
override fun onPopulateNodeForVirtualView(virtualViewId: Int, node: AccessibilityNodeInfoCompat) {
node.contentDescription = currentArticle.title
// Add TalkBack custom actions as replacements for the swipe gestures.
// R.id.action_bookmark is an app-defined action ID declared in res/values/ids.xml —
// there is no ACTION_CUSTOM constant; custom actions use your own ID plus a label.
node.addAction(AccessibilityNodeInfoCompat.AccessibilityActionCompat(
R.id.action_bookmark,
context.getString(R.string.action_bookmark)
))
}
override fun onPerformActionForVirtualView(virtualViewId: Int, action: Int, args: Bundle?): Boolean {
if (action == R.id.action_bookmark) {
bookmarkCurrentArticle()
return true
}
return false
}
}
init {
ViewCompat.setAccessibilityDelegate(this, accessibilityHelper)
}
// Touch events: handled by swipe detector for sighted users
override fun onTouchEvent(event: MotionEvent): Boolean {
swipeGestureDetector.onTouchEvent(event)
return true
}
// Hover events: dispatched to ExploreByTouchHelper for TalkBack users
override fun dispatchHoverEvent(event: MotionEvent): Boolean {
return accessibilityHelper.dispatchHoverEvent(event) || super.dispatchHoverEvent(event)
}
}
The dispatchHoverEvent() override passes hover events to the ExploreByTouchHelper (TalkBack's touch exploration mechanism), while onTouchEvent() continues to handle sighted users' swipe gestures. TalkBack users can now explore the article view with touch and perform the bookmark/archive actions via TalkBack custom actions ("Double-tap with two fingers").
Fix 4: Font Scaling (Compose and View System)
The problem: the app uses dp values for text sizes, which do not scale with the system font size. The fix: replace all dp text size values with sp.
kotlin
// BEFORE (broken — does not scale with system font)
val density = LocalDensity.current
Text(text = article.title, fontSize = with(density) { 18.dp.toSp() }) // Wrong — toSp() divides out the user's fontScale, pinning the text to 18dp
// WRONG in XML: android:textSize="18dp"
// AFTER (correct — scales with system font)
Text(text = article.title, fontSize = 18.sp) // Correct Compose usage
// CORRECT in XML: android:textSize="18sp"
An automated fix for the View system: add a custom lint rule (TextSizeDpUsage) that flags any android:textSize attribute with a dp suffix in XML layouts:
kotlin
class TextSizeDpLintRule : LayoutDetector() {
override fun getApplicableAttributes() = listOf("textSize")
override fun visitAttribute(context: XmlContext, attr: Attr) {
if (attr.value.endsWith("dp")) {
context.report(ISSUE, attr, context.getValueLocation(attr),
"Text sizes should use `sp` not `dp` to support user font size preferences")
}
}
}
A Compose equivalent lint rule can detect fontSize = X.dp patterns. Run both lint rules in CI — any PR that introduces a dp-unit text size fails the build.
Permanent Accessibility Standards for the Development Team
Three practices to adopt permanently: (1) AccessibilityChecks.enable() in all Espresso tests — add AccessibilityChecks.enable() to the @Before setup of every instrumented test; this automatically runs accessibility checks on every screen state tested, catching content description and contrast ratio issues in CI without a human running TalkBack. (2) TalkBack walkthrough in the Definition of Done — every new screen must be navigated with TalkBack enabled before the PR is merged; this is a 5-minute addition to the PR checklist that catches the most obvious TalkBack failures; add a screenshot or screen recording of the TalkBack walkthrough to the PR description. (3) Semantic Composable wrappers for reusable patterns — create AccessibleCard, AccessibleIconButton, and AccessibleImage Composable wrappers in the :core:design module that enforce the correct semantic structure; when developers use these wrappers instead of the base Card, IconButton, and Image Composables, they get correct accessibility behaviour by default rather than having to remember to add semantics manually.
Early Warning Metrics:
- Automated AccessibilityChecks failure count per PR — zero tolerance policy on new accessibility check failures introduced in new code; existing failures are tracked as a declining backlog; a PR that introduces even one new failure is blocked until fixed
- TalkBack navigation success rate in the monthly manual audit — a QA analyst navigates the 5 core user flows (sign in, read article, search, bookmark, share) using TalkBack and rates each step as Pass/Fail; target 100% pass rate for the core flows; failures are treated with P1 priority
- Font scale crash rate in Firebase Crashlytics — filter crashes for devices where Configuration.fontScale > 1.3 (users who have set large font sizes); these crashes are often ConstraintLayout or text truncation crashes that only appear at large font scales; target zero font-scale-specific crashes
4. Interview Score: 9.5 / 10
Why this demonstrates senior-level maturity: The dispatchHoverEvent() + ExploreByTouchHelper pattern for the custom swipe gesture view — specifically the insight that TalkBack uses hover events for touch exploration while sighted users' swipe gestures use touch events, and that these two event streams can be intercepted separately — is the accessibility engineering depth that comes from actually debugging TalkBack issues on custom gesture views rather than reading documentation. The AccessibilityNodeInfoCompat.className = ProgressBar::class.java.name trick for making the custom progress indicator recognisable to TalkBack shows knowledge of how TalkBack's semantic tree works at a low level. The custom Lint rule for detecting dp-unit text sizes in CI is the tooling automation that ensures the fix persists across the team's future work.
What differentiates it from mid-level thinking: A mid-level Android developer would add contentDescription to the image (the most obvious fix) and would know about sp vs. dp for text sizes, but would not know about mergeDescendants = true for grouping compound views, would not know about dispatchHoverEvent() for custom touch gesture views, and would not know about AccessibilityDelegateCompat for custom views. They would not design the AccessibleCard wrapper Composable or the custom Lint rule for CI enforcement.
What would make it a 10/10: A 10/10 response would include the complete AccessibleArticleCard Composable implementation that is added to the :core:design system as the enforced reusable pattern, a complete Espresso AccessibilityChecks setup in the BaseInstrumentedTest class showing the enable() call and the global accessibility check configuration, and a WCAG 2.1 AA compliance checklist mapped to each of the 4 fixes showing which specific success criterion each fix addresses.
Question 10: Kotlin Multiplatform Mobile — Sharing Business Logic Between Android and iOS
Difficulty: Elite | Role: Android Developer | Level: Senior / Staff | Company Examples: JetBrains, Netflix, Touchlab, VMware, Philips
The Question
You are a Senior Android Developer at a company with native Android and iOS apps that have separate codebases maintained by two separate teams. The two apps have diverged significantly: they implement different business logic for the same features, the same bugs exist independently on both platforms, and the two teams spend significant time duplicating work. The CTO has approved a Kotlin Multiplatform Mobile (KMP) migration to share business logic between Android and iOS while keeping native UIs. You have been asked to design the KMP architecture — which layers to share, how to handle platform-specific differences (like networking and local storage), the interoperability with Swift on iOS, and the migration strategy that does not disrupt either team. Walk through the architecture, the specific Kotlin Multiplatform APIs and libraries, and the organisational considerations for a two-team migration.
1. What Is This Question Testing?
- KMP architecture decisions — understanding which layers are appropriate to share in KMP: the domain layer (use cases — pure Kotlin, no platform dependencies, the easiest to share) and the data layer (repositories, data models, API clients); and which layers should remain platform-specific: the presentation layer (Android ViewModel / iOS ViewModel equivalent — though KMMViewModel from Touchlab enables sharing) and the UI layer (Compose on Android, SwiftUI on iOS); knowing that a common mistake is trying to share too much — the UI and platform-specific system integrations (push notifications, biometrics, camera) are best left native
- The expect/actual mechanism — understanding KMP's primary mechanism for platform-specific implementations: expect declarations in the commonMain source set define the interface that must be implemented, and actual declarations in androidMain and iosMain provide the platform-specific implementations; knowing when to use expect/actual vs. dependency injection: expect/actual is for platform capabilities that cannot be abstracted (like kotlinx.datetime.Clock.System vs. the platform's clock), while DI is for business logic abstractions (repositories that have different implementations per platform)
- KMP networking and storage libraries — knowing the KMP-compatible library ecosystem: Ktor (JetBrains' HTTP client — uses OkHttp on Android and Darwin's URLSession on iOS under the hood), SQLDelight (the KMP-compatible database library that generates type-safe Kotlin APIs from SQL, backed by SQLite on both Android and iOS), and kotlinx.coroutines (the KMP coroutines library that provides Flow on both platforms); and knowing that Retrofit and Room are Android-only (not KMP-compatible)
- Swift/Kotlin interoperability — knowing that KMP compiles the shared module to an Objective-C framework (not Swift directly) via Kotlin/Native; knowing the limitations: Kotlin suspend fun is exposed to Swift as completion-handler callbacks (recent Kotlin versions also export suspend functions as Swift async functions, though the interop is still maturing); knowing that Flow does not work natively in Swift — it must be wrapped (for example with a StateFlow-to-Combine wrapper) or consumed via Touchlab's SKIE (Swift-Kotlin Interface Enhancer) plugin, which generates idiomatic Swift code from Kotlin APIs
- The iosMain and androidMain source sets — knowing the KMP source set hierarchy: commonMain (shared code), androidMain (Android-specific actual implementations and Android-only classes), iosMain (iOS-specific actual implementations — compiled with Kotlin/Native), and iosTest / androidTest (platform-specific tests); knowing that tests for the shared business logic in commonMain go in commonTest and run on both platforms
- Migration strategy — understanding that migrating an existing Android codebase to KMP is a gradual process, not a rewrite; the approach: identify the domain layer use cases that both platforms implement identically, extract them to a shared KMP module, update the Android app to use the shared module (minimal change — it already uses Kotlin), and write the iOS bridge to consume the shared module from Swift (the larger engineering effort)
2. Framework: Kotlin Multiplatform Mobile Architecture Model (KMPMAM)
- Assumption Documentation — Inventory the code that is duplicated between the Android and iOS codebases: list every API endpoint, every business rule, every data validation, and every local storage schema that both platforms implement; the size of this inventory determines the ROI of the KMP migration; if 30% of the Android codebase is pure business logic that iOS duplicates, KMP provides significant value; if both apps are 90% UI code, the value is lower
- Constraint Analysis — The iOS team may have limited or no Kotlin experience; the initial KMP module must be consumable from Swift with minimal iOS-side boilerplate; the Android team does not want to change its Coroutines + Flow paradigm for the sake of iOS compatibility; SKIE addresses both constraints by automatically generating Swift-idiomatic wrappers for the Kotlin APIs
- Tradeoff Evaluation — Share only the domain layer (safest, smallest migration, lowest iOS impact) vs. share domain + data layer (higher value, requires replacing Retrofit with Ktor and Room with SQLDelight on Android — a significant migration) vs. share domain + data + presentation (maximum sharing, requires KMMViewModel, most complex); the recommended starting point is domain-only sharing to prove the concept and build team confidence before expanding
- Hidden Cost Identification — Kotlin/Native compilation time: the iosArm64 and iosSimulatorArm64 targets compile significantly slower than the JVM target; on a large shared module, iOS builds can add 5–15 minutes to the iOS CI pipeline; use Kotlin/Native's incremental compilation and build caching to mitigate this
- Risk Signals / Early Warning Metrics — Shared module test coverage (the shared domain layer must be tested in commonTest — tests run on both the JVM and the Kotlin/Native iOS target; a coverage drop in the shared module introduces risk on both platforms simultaneously), iOS crash rate post-KMP migration (KMP introduces a new code path for iOS; monitor Firebase Crashlytics for crash signatures that contain Kotlin/Native frames), Swift interop build warning count (SKIE and the KMP iOS framework should compile without warnings; warnings indicate KMP API design decisions that produce non-idiomatic Swift code)
- Pivot Triggers — If the iOS team reports that consuming the KMP module from Swift requires significantly more boilerplate than their existing native implementation: SKIE is not installed or is not generating idiomatic Swift; schedule a dedicated session with the iOS team to review the generated Swift API and refine the Kotlin API design to produce cleaner Swift output
- Long-Term Evolution Plan — Phase 1: shared domain layer (use cases, data models, validation logic); Phase 2: shared data layer (Ktor networking, SQLDelight persistence — requires Retrofit → Ktor migration on Android); Phase 3: shared ViewModel (KMMViewModel for shared presentation state); Phase 4: evaluate Compose Multiplatform for select screens where the UI is identical on both platforms
3. The Answer
The Shared Module Architecture
The KMP project structure: shared/ (the KMP module), androidApp/ (the Android app that depends on shared), iosApp/ (the iOS app built with Xcode that imports the compiled iOS framework from shared). Within the shared/ module, three source sets: commonMain/ (all code that runs on both platforms), androidMain/ (Android-specific actual implementations and Android-specific dependencies), iosMain/ (iOS-specific actual implementations, compiled with Kotlin/Native). The shared/commonMain/ directory contains: domain use cases (GetArticlesUseCase, BookmarkArticleUseCase, SearchArticlesUseCase), domain models (Article, User, Bookmark), repository interfaces (ArticleRepository, UserRepository), and common utilities (Result<T> sealed class, CoroutineDispatchers interface).
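A plain-Kotlin sketch of the commonMain building blocks named above — a Result-style sealed class, a repository interface, and a use case. The shapes are assumptions (the document's real use cases are suspending; this sketch is simplified to non-suspending calls so it stays self-contained), and it is renamed AppResult to avoid clashing with kotlin.Result:

```kotlin
// commonMain sketch — pure Kotlin, no Android or iOS imports, so it compiles
// for both the JVM and Kotlin/Native targets.
data class Article(val id: String, val title: String, val isBookmarked: Boolean)

sealed class AppResult<out T> {
    data class Success<T>(val value: T) : AppResult<T>()
    data class Failure(val error: Throwable) : AppResult<Nothing>()
}

interface ArticleRepository {
    fun getArticles(): AppResult<List<Article>>
}

class GetArticlesUseCase(private val repository: ArticleRepository) {
    // Bookmarked-first ordering is an illustrative business rule that would be
    // tested once in commonTest and shared by both apps.
    fun execute(): AppResult<List<Article>> = when (val result = repository.getArticles()) {
        is AppResult.Success -> AppResult.Success(result.value.sortedByDescending { it.isBookmarked })
        is AppResult.Failure -> result
    }
}

fun main() {
    val repo = object : ArticleRepository {
        override fun getArticles() = AppResult.Success(listOf(
            Article("1", "Markets", isBookmarked = false),
            Article("2", "Election", isBookmarked = true)
        ))
    }
    val sorted = (GetArticlesUseCase(repo).execute() as AppResult.Success).value
    println(sorted.map { it.id })
}
```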
expect/actual for Platform Capabilities
Some capabilities exist on both platforms but use different APIs. Use expect/actual for these — for example, platform-specific dispatchers (on kotlinx.coroutines versions before 1.7, Kotlin/Native provided no Dispatchers.IO, so the iOS target needed its own background dispatcher such as newSingleThreadContext("IO")):
kotlin
// commonMain
expect val ioDispatcher: CoroutineDispatcher
// androidMain
actual val ioDispatcher: CoroutineDispatcher = Dispatchers.IO
// iosMain
actual val ioDispatcher: CoroutineDispatcher = newSingleThreadContext("IO")
Platform logging (different logging APIs on Android and iOS):
kotlin
// commonMain
expect fun logDebug(tag: String, message: String)
// androidMain
actual fun logDebug(tag: String, message: String) = Log.d(tag, message)
// iosMain
actual fun logDebug(tag: String, message: String) = NSLog("[$tag] $message")
Ktor for Shared Networking
Replace Retrofit (Android-only) with Ktor in the shared module:
kotlin
// commonMain — Ktor HTTP client configuration
val httpClient = HttpClient {
install(ContentNegotiation) { json() }
install(Logging) {
level = LogLevel.BODY
logger = object : Logger {
override fun log(message: String) = logDebug("HTTP", message)
}
}
install(HttpTimeout) { requestTimeoutMillis = 10_000 }
}
// Ktor uses OkHttp engine on Android and Darwin engine on iOS automatically
The ArticleApiService in commonMain uses the shared Ktor client — identical code runs on both platforms, using the platform's native networking layer under the hood.
SQLDelight for Shared Local Storage
Replace Room (Android-only) with SQLDelight:
sql
-- commonMain/sqldelight/com/example/db/Article.sq
CREATE TABLE Article (
id TEXT NOT NULL PRIMARY KEY,
title TEXT NOT NULL,
author TEXT NOT NULL,
content TEXT NOT NULL,
publishedAt INTEGER NOT NULL,
isBookmarked INTEGER NOT NULL DEFAULT 0
);

getAll:
SELECT * FROM Article ORDER BY publishedAt DESC;

getById:
SELECT * FROM Article WHERE id = ?;

upsert:
INSERT OR REPLACE INTO Article VALUES (?, ?, ?, ?, ?, ?);
SQLDelight generates type-safe Kotlin code from the SQL. The ArticleDatabase class is identical in commonMain; the platform-specific driver is provided via DI — the Android app provides AndroidSqliteDriver and the iOS app provides NativeSqliteDriver. The ArticleLocalDataSource in commonMain uses the generated ArticleQueries class — identical Kotlin code on both platforms.
Swift Interoperability with SKIE
The raw KMP framework exposes Kotlin suspend fun to Swift as completion handler callbacks and Flow is not accessible directly. SKIE (the co.touchlab.skie Gradle plugin) transforms the Kotlin API into idiomatic Swift:
kotlin
// commonMain — Kotlin API
class GetArticlesUseCase(private val repository: ArticleRepository) {
    suspend fun execute(): Result<List<Article>> = repository.getArticles()
    val articlesFlow: Flow<List<Article>> = repository.articlesFlow
}
With SKIE, in Swift this becomes:
swift
// Swift — SKIE-generated idiomatic API
let useCase = GetArticlesUseCase(repository: articleRepo)
// Kotlin suspend fun → Swift async/await
let result = try await useCase.execute()
// Kotlin Flow → AsyncSequence (native Swift concurrency)
for await articles in useCase.articlesFlow {
    updateUI(articles: articles)
}
Without SKIE, the iOS team would need to write GetArticlesUseCase.shared.execute { result, error in ... } callback-style code for every Kotlin coroutine — significantly more boilerplate that discourages KMP adoption.
Migration Strategy: Domain Layer First
Week 1–2 (Android team): create the shared/ KMP module, define the ArticleRepository interface and domain models in commonMain, migrate the existing Android GetArticlesUseCase to commonMain (pure Kotlin, no platform changes required), update the Android app to use the shared use case (remove the local copy). Week 3–4 (both teams): write comprehensive tests in commonTest (run on the JVM via Kotlin/JVM and on the iOS simulator via Kotlin/Native); the tests run in both teams' CI. Week 5–6 (iOS team): consume the shared module from the iOS app; the iOS team implements the ArticleRepository interface (the iOS-native networking implementation for the data layer — the full data layer migration comes in Phase 2) and uses the shared GetArticlesUseCase in their ViewModel. At the end of Phase 1: the same use case logic runs on both platforms, tested once, maintained once.
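A commonTest for the shared use case might look like the following sketch (the FakeArticleRepository and the Article fields are illustrative assumptions about the interface defined above); the same test compiles and runs on both the JVM and Kotlin/Native targets:

```kotlin
// commonTest — executes on JVM and on the iOS simulator via Kotlin/Native
import kotlin.test.Test
import kotlin.test.assertEquals
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.emptyFlow
import kotlinx.coroutines.test.runTest

class FakeArticleRepository : ArticleRepository {
    var articles: List<Article> = emptyList()
    override suspend fun getArticles(): Result<List<Article>> = Result.success(articles)
    override val articlesFlow: Flow<List<Article>> = emptyFlow()
}

class GetArticlesUseCaseTest {
    @Test
    fun returnsArticlesFromRepository() = runTest {
        val repo = FakeArticleRepository().apply {
            articles = listOf(Article(id = "1", title = "KMP"))
        }
        val result = GetArticlesUseCase(repo).execute()
        assertEquals(1, result.getOrThrow().size)
    }
}
```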
Organisational Considerations
The shared/ module ownership: the shared module is owned by a "platform guild" — a cross-functional group of 2 Android engineers and 2 iOS engineers who maintain the shared APIs, review the commonMain code, and ensure the Swift API quality. Neither team can add to commonMain without the platform guild's review. The iOS engineers on the platform guild develop proficiency in Kotlin over time; the Android engineers learn the iOS consumption patterns. This creates a shared ownership model that prevents the KMP module from becoming an Android team's codebase that the iOS team reluctantly consumes.
Early Warning Metrics:
- commonTest pass rate on both JVM and iOS simulator targets in CI — any test failure that occurs on the iOS target but not the JVM target indicates a Kotlin/Native compatibility issue (a Kotlin feature or library that is not supported in Kotlin/Native); alert immediately and resolve before the iOS app ships
- iOS app binary size increase per KMP module addition — the Kotlin/Native compiled framework adds to the iOS app binary; alert if the shared.xcframework exceeds 5MB for a domain-only module (indicates unintended dependencies being included in the iOS target)
- iOS team use case adoption rate — the percentage of new features where the iOS team uses the shared use case rather than re-implementing it natively; target above 80% for new features after the initial migration; below 50% suggests the Swift API is not idiomatic enough to be preferred over native implementation
4. Interview Score: 9.5 / 10
Why this demonstrates senior-level maturity: The SKIE plugin recommendation — with the specific contrast between callback-style Swift code (without SKIE) and async/await + AsyncSequence (with SKIE) as the reason iOS teams resist KMP adoption when the interop is not idiomatic — is the production KMP experience detail that determines whether a KMP migration succeeds or fails organisationally. The "platform guild" ownership model (2 Android + 2 iOS engineers reviewing commonMain changes) addresses the political reality that a KMP module owned only by the Android team produces an API that the iOS team resents — a common KMP adoption failure pattern. The expect/actual for ioDispatcher — specifically calling out that Dispatchers.IO did not exist in Kotlin/Native before kotlinx-coroutines 1.7 and must be replaced with newSingleThreadContext("IO") on older versions — is the specific gotcha that Android developers who have not written Kotlin/Native code for iOS routinely hit on their first KMP project.
What differentiates it from mid-level thinking: A mid-level Android developer would propose "use KMP to share all the code" without knowing about the specific libraries (Ktor instead of Retrofit, SQLDelight instead of Room), would not know about the expect/actual pattern for platform-specific capabilities, would not know about SKIE for Swift interoperability, and would not address the Dispatchers.IO incompatibility in Kotlin/Native. They would not design the platform guild organisational model that determines whether the migration succeeds.
What would make it a 10/10: A 10/10 response would include a complete shared/build.gradle.kts showing the KMP target configuration (including iosArm64(), iosSimulatorArm64(), iosX64() for iPhone, Apple Silicon simulator, and Intel simulator respectively), a concrete commonTest for GetArticlesUseCase using a FakeArticleRepository that runs on both JVM and Kotlin/Native, and a SKIE plugin configuration snippet showing the Swift API generation settings.
Question 11: Dependency Injection — Advanced Hilt Patterns, Testing, and Scoping
Difficulty: Senior | Role: Android Developer | Level: Senior | Company Examples: Google, Square, Airbnb, Grab, Booking.com
The Question
You are a Senior Android Developer at an e-commerce company. The team has adopted Hilt for dependency injection but a code review has surfaced several problems: a ProductRepository is scoped to ViewModelComponent (recreated on every screen rotation rather than being a singleton), a UserSessionManager that holds the authentication token is injected directly into 8 different ViewModels without going through a Repository abstraction, an integration test for the checkout ViewModel is not possible because the payment API client cannot be replaced with a test double, and a feature flag library that must be initialised before any Fragment is shown is initialised lazily via a Hilt-provided singleton — but there is no guarantee it has initialised before the first Fragment's onCreateView. Walk through the correct Hilt scoping for each component, how you redesign the UserSessionManager to be testable and architecturally sound, how you replace Hilt components in tests, and how you handle initialisation ordering for critical dependencies.
1. What Is This Question Testing?
- Hilt component hierarchy and scoping — understanding the full Hilt component hierarchy: SingletonComponent (application lifetime — for OkHttpClient, Retrofit, RoomDatabase, UserSessionManager), ActivityRetainedComponent (survives configuration changes, same lifetime as the ViewModel — for repositories), ViewModelComponent (one instance per ViewModel instance, destroyed when the ViewModel is cleared), ActivityComponent (one instance per Activity, destroyed on onDestroy()), FragmentComponent, ViewComponent, ServiceComponent; and knowing the @Singleton, @ActivityRetainedScoped, @ViewModelScoped, @ActivityScoped, and @FragmentScoped scope annotations that correspond to each component
- Repository scoping — the common mistake — knowing that a Repository should be @ActivityRetainedScoped (or @Singleton if truly application-wide), not @ViewModelScoped; a @ViewModelScoped Repository is created fresh for every ViewModel instance — this means two ViewModels on the same screen receive different Repository instances, breaking the "single source of truth" principle; a Repository that maintains an in-memory cache is particularly broken when scoped to the ViewModel because the cache is lost on every configuration change
- Hilt testing with @UninstallModules and @TestInstallIn — knowing that Hilt provides HiltAndroidRule for Android instrumented tests; knowing the two mechanisms for replacing production dependencies with test doubles: @UninstallModules(ProductionModule::class) (removes the production binding at the test class level, then provides a test double in the test class itself), and @TestInstallIn(components = [SingletonComponent::class], replaces = [ProductionModule::class]) (replaces a module globally for all tests in the test source set — useful for replacing an API client with a MockWebServer across all integration tests)
- Hilt for ViewModel testing — knowing that ViewModelComponent dependencies cannot be tested with the standard Hilt Android test rules (which target ActivityComponent and FragmentComponent); for ViewModel unit tests, Hilt is bypassed entirely — the ViewModel is instantiated directly with manually-created test doubles, using SavedStateHandle() for the saved state parameter; @HiltViewModel ViewModels can be instantiated without Hilt in unit tests because their dependencies are passed as constructor parameters
- @EntryPoint for non-Hilt entry points — knowing that some Android classes cannot have dependencies injected via constructor (e.g., ContentProvider, BroadcastReceiver, WorkerFactory) because Android instantiates them; Hilt provides @EntryPoint to define an interface for accessing the Hilt object graph from classes that Hilt does not directly inject; EntryPointAccessors.fromApplication() retrieves the entry point
- Initialisation ordering with AppInitializer — knowing the Jetpack App Startup library's Initializer<T> interface for specifying component initialisation order declaratively; knowing that class FeatureFlagInitializer : Initializer<FeatureFlagClient> can declare override fun dependencies(): List<Class<out Initializer<*>>> = listOf(OtherLibraryInitializer::class.java) to enforce that OtherLibrary initialises before FeatureFlagClient; and knowing how to integrate App Startup with Hilt (the Initializer runs before Hilt is ready — it stores the result in a lateinit var that Hilt's @Provides function reads)
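The @EntryPoint mechanism described above can be sketched as follows (the entry-point interface name and the BroadcastReceiver scenario are illustrative assumptions, not from the question's codebase):

```kotlin
// Entry point interface — exposes Hilt-managed bindings to classes Hilt does not inject
@EntryPoint
@InstallIn(SingletonComponent::class)
interface ProductEntryPoint {
    fun productRepository(): ProductRepository
}

// A BroadcastReceiver is instantiated by the system, so constructor
// injection is impossible — reach into the Hilt graph instead
class PriceAlertReceiver : BroadcastReceiver() {
    override fun onReceive(context: Context, intent: Intent) {
        val entryPoint = EntryPointAccessors.fromApplication(
            context.applicationContext,
            ProductEntryPoint::class.java
        )
        val repository = entryPoint.productRepository() // Hilt-provided singleton
        // ... use the repository
    }
}
```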
2. Framework: Hilt Architecture and Testing Model (HATM)
- Assumption Documentation — Map every injected class to its correct scope by asking: how long should this object live? (application lifetime → @Singleton; ViewModel group lifetime → @ActivityRetainedScoped; single ViewModel → @ViewModelScoped); how many instances should exist? (one per application → @Singleton; one per screen → @ActivityRetainedScoped)
- Constraint Analysis — Changing a scope from @ViewModelScoped to @ActivityRetainedScoped is a breaking change if any code assumed that different ViewModels received different Repository instances; audit all usages before changing the scope
- Tradeoff Evaluation — Singleton scope for ProductRepository (simplest, always the same instance) vs. @ActivityRetainedScoped (a new instance per Activity lifetime — appropriate if the repository holds Activity-specific state like the current user's cart); for a product listing repository with no Activity-specific state, @Singleton is correct
- Hidden Cost Identification — Every @Singleton dependency is held in memory for the entire application lifetime; avoid making large in-memory data structures singletons; a ProductRepository can be @Singleton while its in-memory cache uses a WeakReference or a time-based eviction strategy
- Risk Signals / Early Warning Metrics — Hilt missing-binding error rate in CI (a MissingBinding error means a dependency is requested but no @Provides or @Binds function provides it — this always indicates an architecture error, never a runtime exception that should be caught); test double replacement coverage (every external API call in the production code must have a corresponding @TestInstallIn or @UninstallModules test replacement)
- Pivot Triggers — If the Hilt dependency graph becomes so complex that ./gradlew hiltJavaCompileDebug takes more than 2 minutes: the project has too many modules contributing to a single Hilt component; consider breaking the monolithic @Singleton component across feature modules using Hilt's @DefineComponent for feature-scoped sub-components
- Long-Term Evolution Plan — Establish a Hilt scope guide (a one-page document in the team wiki mapping each class type to its correct scope) as a pre-read for all new engineers; include scope verification in the code review checklist
3. The Answer
Fix 1: ProductRepository Scope — @ViewModelScoped to @ActivityRetainedScoped
The problem: @ViewModelScoped creates a new ProductRepository instance for every ViewModel. If HomeViewModel and CartViewModel both inject ProductRepository, they receive different instances — each maintains its own independent in-memory product cache, so a product cached through CartViewModel is invisible to HomeViewModel. The fix: change the scope to @ActivityRetainedScoped (both ViewModels in the same Activity share the same Repository instance) or @Singleton (the entire application shares one instance). For a ProductRepository that maintains a global product catalogue, @Singleton is correct:
kotlin
@Module
@InstallIn(SingletonComponent::class) // Was ViewModelComponent
object RepositoryModule {
    @Singleton // Was @ViewModelScoped
    @Provides
    fun provideProductRepository(
        api: ProductApi,
        db: ProductDatabase
    ): ProductRepository = ProductRepositoryImpl(api, db)
}
Fix 2: UserSessionManager — Injected Into 8 ViewModels
The problem: UserSessionManager holds the auth token and is injected directly into 8 ViewModels. This violates the architecture principle that ViewModels should not know about authentication infrastructure — they should receive user-specific data through their domain use cases. The redesign: UserSessionManager is a @Singleton and is injected only into the repositories that need user identity to filter their queries. The ViewModels never see UserSessionManager — they receive the user-filtered data through their use cases:
kotlin
// BEFORE (broken)
@HiltViewModel
class OrderHistoryViewModel @Inject constructor(
    private val orderApi: OrderApi,
    private val userSessionManager: UserSessionManager // ViewModel knows about auth
) : ViewModel() {
    fun loadOrders() {
        val userId = userSessionManager.currentUserId // ViewModel handles auth
        orderApi.getOrders(userId)
    }
}

// AFTER (correct)
@HiltViewModel
class OrderHistoryViewModel @Inject constructor(
    private val getOrderHistoryUseCase: GetOrderHistoryUseCase // No auth awareness
) : ViewModel() {
    fun loadOrders() = viewModelScope.launch {
        getOrderHistoryUseCase() // Use case handles the userId internally
    }
}

// GetOrderHistoryUseCase uses UserSessionManager internally
class GetOrderHistoryUseCase @Inject constructor(
    private val orderRepository: OrderRepository,
    private val userSessionManager: UserSessionManager
) {
    suspend operator fun invoke(): Flow<List<Order>> =
        orderRepository.getOrdersForUser(userSessionManager.currentUserId)
}
Fix 3: Replacing the Payment API Client in Tests — @TestInstallIn
The problem: PaymentApiClient (a Retrofit service) is bound in NetworkModule which is @InstallIn(SingletonComponent::class); the checkout ViewModel integration test cannot replace it with a test double. Fix: use @TestInstallIn to replace NetworkModule globally in the androidTest source set with a fake that uses MockWebServer:
kotlin
// androidTest/kotlin/com/example/TestNetworkModule.kt
@TestInstallIn(
    components = [SingletonComponent::class],
    replaces = [NetworkModule::class]
)
@Module
object TestNetworkModule {
    @Provides
    @Singleton
    fun providePaymentApiClient(): PaymentApiClient {
        val mockWebServer = MockWebServer()
        // Return a Retrofit-created client pointing at the MockWebServer
        return Retrofit.Builder()
            .baseUrl(mockWebServer.url("/"))
            .addConverterFactory(GsonConverterFactory.create())
            .build()
            .create(PaymentApiClient::class.java)
    }
}
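Wiring this into an instrumented checkout test could look like the following sketch — it assumes TestNetworkModule is extended to also @Provides the MockWebServer instance so the test can enqueue responses; the test name and response code are illustrative:

```kotlin
@HiltAndroidTest
class CheckoutFlowTest {
    @get:Rule(order = 0)
    val hiltRule = HiltAndroidRule(this)

    // Assumes TestNetworkModule also exposes the MockWebServer as a @Singleton binding
    @Inject lateinit var mockWebServer: MockWebServer

    @Before
    fun setUp() = hiltRule.inject()

    @Test
    fun paymentDeclined_showsCheckoutError() {
        // Simulate a payment decline from the fake backend
        mockWebServer.enqueue(MockResponse().setResponseCode(402))
        // ... launch the checkout screen and assert the declined-payment UI state
    }
}
```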
For ViewModel unit tests (not instrumented tests), Hilt is not involved — instantiate the ViewModel with constructor injection using manually-created fakes:
kotlin
// Unit test — no Hilt, direct ViewModel instantiation
@Test
fun checkout_succeeds_updatesUiStateToConfirmed() = runTest {
    val fakePaymentRepo = FakePaymentRepository(shouldSucceed = true)
    val viewModel = CheckoutViewModel(
        processPaymentUseCase = ProcessPaymentUseCase(fakePaymentRepo),
        savedStateHandle = SavedStateHandle(mapOf("orderId" to "order_123"))
    )
    viewModel.onSubmitPaymentClicked()
    advanceUntilIdle()
    assertThat(viewModel.uiState.value).isInstanceOf(CheckoutUiState.Confirmed::class.java)
}
Fix 4: Feature Flag Initialisation Ordering — App Startup + Hilt
The problem: the feature flag library is a @Singleton provided by Hilt's SingletonComponent, but Hilt does not create that component until the Application's onCreate() runs — too late if a ContentProvider (which Android initialises before Application.onCreate()) or a very early Fragment attempts to read a feature flag. Fix: use the Jetpack App Startup library to initialise the feature flag client before Hilt creates its component, and then have Hilt's @Provides read the already-initialised client:
kotlin
// App Startup Initializer — runs before Application.onCreate()
class FeatureFlagInitializer : Initializer<FeatureFlagClient> {
    override fun create(context: Context): FeatureFlagClient {
        val client = FeatureFlagClient(context)
        client.initialize() // Blocking initialisation — completes before any Fragment runs
        FeatureFlagHolder.instance = client // Store in a singleton holder
        return client
    }
    override fun dependencies(): List<Class<out Initializer<*>>> = emptyList()
}

// Singleton holder — bridges App Startup and Hilt
object FeatureFlagHolder {
    lateinit var instance: FeatureFlagClient
}

// Hilt module — reads the already-initialised client from the holder
@Module
@InstallIn(SingletonComponent::class)
object FeatureFlagModule {
    @Provides
    @Singleton
    fun provideFeatureFlagClient(): FeatureFlagClient = FeatureFlagHolder.instance
    // Safe because App Startup guarantees initialisation before the first Activity
}
Declare the initializer in AndroidManifest.xml inside the androidx.startup.InitializationProvider:
xml
<provider android:name="androidx.startup.InitializationProvider" ...>
    <meta-data
        android:name="com.example.FeatureFlagInitializer"
        android:value="androidx.startup" />
</provider>
Early Warning Metrics:
- @Singleton count in the Hilt dependency graph (run ./gradlew generateDebugHiltComponents and inspect the generated DaggerApplicationComponent; a count above 50 @Singleton bindings indicates scope creep — dependencies are being made singletons by default rather than by design)
- Hilt missing binding errors in CI (zero tolerance; any MissingBinding in a CI build indicates a new dependency was added to a ViewModel constructor without a corresponding @Provides function; this must fail the build, not the runtime)
- Test double replacement coverage per external API surface — every @Provides function that provides an external API client must have a corresponding @TestInstallIn replacement in the androidTest source set; a new API client added without a test replacement is a coverage gap flagged in the PR review
4. Interview Score: 9.5 / 10
Why this demonstrates senior-level maturity: The UserSessionManager redesign — moving it from being directly injected into 8 ViewModels to being an implementation detail of the use cases — shows architectural thinking about where authentication context belongs in a Clean Architecture Android app. The FeatureFlagHolder bridge pattern (App Startup initialises the client into a holder, Hilt reads the holder) is the specific production solution to the "Hilt isn't ready during early initialisation" problem that apps with analytics and feature flag SDKs routinely encounter. The distinction between @TestInstallIn (global replacement for all tests) and @UninstallModules (per-test replacement) is the Hilt testing precision that determines whether the test strategy is maintainable.
What differentiates it from mid-level thinking: A mid-level Android developer would know about Hilt's component hierarchy but not recognise that @ViewModelScoped for a Repository violates the single-source-of-truth principle, would not know the difference between @TestInstallIn and @UninstallModules, and would not know about App Startup's Initializer for pre-Hilt dependency initialisation. They would also not design the UserSessionManager to be hidden from ViewModels behind the use case layer.
What would make it a 10/10: A 10/10 response would include a complete Hilt dependency graph diagram showing all 4 component scopes for the e-commerce app, a concrete @DefineComponent feature-scoped sub-component for a feature module that has its own scoped dependencies, and a HiltAndroidTest integration test showing the MockWebServer enqueue and the checkout ViewModel's response to a simulated payment failure.
Question 12: Offline-First Architecture — Network Layer, Caching, and Data Synchronisation
Difficulty: Senior | Role: Android Developer | Level: Senior | Company Examples: Google Maps, Spotify, WhatsApp, Notion, Linear
The Question
You are a Senior Android Developer at a field services company. The app is used by field engineers who perform maintenance inspections in facilities with unreliable or no network connectivity. The app must: display a list of assigned work orders fetched from a REST API, allow engineers to fill in inspection forms (which can have 50+ fields), attach photos taken on-site, submit completed work orders when connectivity is restored, and handle the case where a work order is updated on the server while an engineer is editing it offline — a conflict that must be detected and resolved. The current app makes live API calls for everything and completely fails when offline. Walk through the offline-first architecture using Room, OkHttp caching, WorkManager, and conflict resolution strategy.
1. What Is This Question Testing?
- Room as the single source of truth — understanding that in an offline-first architecture, the UI always reads from the local Room database — never directly from the API; the API is the upstream data source that populates Room, not the source the UI consumes; the data flow is: Repository fetches from API → Repository writes to Room → Room Flow emits updates → ViewModel collects the Room Flow → UI renders; this pattern means the UI always shows cached data (even when offline) and updates reactively when fresh data arrives
- OkHttp network response caching — knowing that OkHttp's Cache class caches HTTP responses according to the server's Cache-Control headers; knowing that CacheControl.FORCE_CACHE forces OkHttp to serve from the cache regardless of freshness (useful for truly offline mode), and CacheControl.FORCE_NETWORK bypasses the cache (useful for manual refresh); knowing that OkHttp's cache is appropriate for GET responses (reading work order lists) but not for POST/PUT (submitting work orders — these require custom offline queue logic)
- WorkManager for offline submission queue — understanding that work order submission when offline requires an offline queue: store the pending submission in Room with a "pending" status, enqueue a OneTimeWorkRequest with a NetworkType.CONNECTED constraint, and process the submission when connectivity is restored; knowing the idempotency requirement: if WorkManager retries the submission (the first attempt failed mid-transmission), the server must not create duplicate work orders — use an idempotency key (the local work order ID) in the request header
- Conflict detection and resolution — knowing the optimistic concurrency pattern: the server returns an ETag (or version field) with each work order; the client includes the If-Match: <etag> header when submitting updates; if the server's current version does not match the client's If-Match header, the server returns HTTP 412 Precondition Failed, indicating a conflict; the client must then fetch the server's current version and present the engineer with a conflict resolution UI
- Photo attachment handling in offline mode — knowing that photo attachments are large binary files that should not be stored in Room (Room stores metadata, not binary blobs); photos are stored in the device's internal storage (Context.filesDir) or scoped external storage; the Room entity stores the file path (a URI string); the attachment upload is a separate WorkRequest that runs after the work order submission completes — WorkManager's chaining ensures this ordering
- Flow.combine for an offline status indicator — knowing how to expose a combined UI state that reflects both the local data and the network connectivity status; combine(workOrdersFlow, connectivityFlow) { orders, isConnected -> WorkOrderUiState(orders, isConnected) } produces a state that the UI uses to show an "offline mode" banner without interrupting the engineer's workflow
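The combine pattern from the last bullet can be sketched like this; the callbackFlow-based ConnectivityMonitor and the WorkOrderUiState shape are illustrative assumptions:

```kotlin
// Connectivity as a Flow, bridged from ConnectivityManager callbacks
class ConnectivityMonitor(private val connectivityManager: ConnectivityManager) {
    val isConnected: Flow<Boolean> = callbackFlow {
        val callback = object : ConnectivityManager.NetworkCallback() {
            override fun onAvailable(network: Network) { trySend(true) }
            override fun onLost(network: Network) { trySend(false) }
        }
        connectivityManager.registerDefaultNetworkCallback(callback)
        awaitClose { connectivityManager.unregisterNetworkCallback(callback) }
    }
}

// In the ViewModel: one UI state combining local data and connectivity
val uiState: StateFlow<WorkOrderUiState> =
    combine(
        workOrderRepository.getAssignedWorkOrders(),
        connectivityMonitor.isConnected
    ) { orders, isConnected ->
        WorkOrderUiState(orders, isOffline = !isConnected)
    }.stateIn(viewModelScope, SharingStarted.WhileSubscribed(5_000), WorkOrderUiState())
```

Because the Room Flow and the connectivity Flow are combined, the "offline mode" banner appears and disappears without any imperative state management in the UI layer.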
2. Framework: Offline-First Architecture Design Model (OFADM)
- Assumption Documentation — Establish the data model for offline support: which entities must be available offline (work orders, inspection forms, facility information), which can be online-only (historical reports, admin configuration), and what the maximum acceptable staleness is for each entity (work order assignments: sync every 30 minutes; facility photos: sync on demand)
- Constraint Analysis — Photos can be 3–8MB each; a work order with 10 photos requires 30–80MB of local storage; the offline architecture must respect Android's storage constraints (scoped storage for Android 10+, the ACTION_OPEN_DOCUMENT picker for accessing files outside the app's directory)
- Tradeoff Evaluation — OkHttp cache for GET responses (simple, automatic, works with standard HTTP caching) vs. always writing to Room then reading from Room (more code, more control, supports complex queries and offline mutation); for a forms-based app where engineers edit data offline, Room is essential — OkHttp cache handles read caching but cannot handle offline writes
- Hidden Cost Identification — The "pending sync" state for work orders creates a new UI requirement: the work order list must visually distinguish between synced items, locally edited (pending sync) items, and items being actively synced; this tri-state indicator must be designed before the implementation begins
- Risk Signals / Early Warning Metrics — Pending sync queue size (the number of work orders in Room with status PENDING_SYNC; a queue that grows without clearing indicates the WorkManager sync is not executing — investigate network constraint configuration or Hilt binding for the Worker), conflict detection rate (the percentage of submissions that receive HTTP 412; above 5% indicates the work order assignment process is creating concurrent edit scenarios that the UX should prevent)
- Pivot Triggers — If the conflict detection rate exceeds 15% during field trials: the conflict is structural (engineers are being assigned to the same work order simultaneously); the backend assignment model must be changed, not just the client-side conflict resolution UX
- Long-Term Evolution Plan — Phase 1: Room as single source of truth for work orders (read-only offline support); Phase 2: offline submission queue with WorkManager; Phase 3: photo attachment offline support; Phase 4: conflict resolution UI; Phase 5: delta sync (only sync changed fields, not entire work orders — reduces bandwidth for large forms)
3. The Answer
The Core Pattern: Repository as the Single Source of Truth Mediator
kotlin
class WorkOrderRepository @Inject constructor(
    private val api: WorkOrderApi,
    private val db: WorkOrderDatabase,
    private val connectivityMonitor: ConnectivityMonitor
) {
    // UI always reads from Room — never from the API directly
    fun getAssignedWorkOrders(): Flow<List<WorkOrder>> =
        db.workOrderDao().getAll() // Room Flow — automatically emits on data change

    // Refresh from network (called by ViewModel on pull-to-refresh or scheduled sync)
    suspend fun refreshFromNetwork() {
        if (!connectivityMonitor.isConnected) return // Don't attempt a network call offline
        try {
            val serverWorkOrders = api.getAssignedWorkOrders()
            db.withTransaction {
                serverWorkOrders.forEach { serverOrder ->
                    val localOrder = db.workOrderDao().getById(serverOrder.id)
                    if (localOrder?.status == WorkOrderStatus.PENDING_SYNC) {
                        // Don't overwrite locally edited data with server data
                        // until the local edit is submitted
                    } else {
                        db.workOrderDao().upsert(serverOrder)
                    }
                }
            }
        } catch (e: IOException) {
            // Network failure — Room already has cached data, the UI is unaffected
            Timber.w(e, "Network refresh failed, serving cached data")
        }
    }
}
The db.workOrderDao().getAll() returns a Flow<List<WorkOrder>> — Room emits a new list whenever any work order is inserted, updated, or deleted. The UI is always reactive to the latest Room state.
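The DAO behind this repository could be sketched as follows (the query columns and ordering are illustrative assumptions; @Upsert requires Room 2.5+):

```kotlin
@Dao
interface WorkOrderDao {
    // Returning Flow makes Room re-emit whenever the WorkOrder table changes
    @Query("SELECT * FROM WorkOrder ORDER BY id ASC")
    fun getAll(): Flow<List<WorkOrder>>

    @Query("SELECT * FROM WorkOrder WHERE id = :id")
    suspend fun getById(id: String): WorkOrder?

    // Insert-or-update keeps the network refresh idempotent
    @Upsert
    suspend fun upsert(workOrder: WorkOrder)
}
```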
Offline Form Editing and the PENDING_SYNC State
When an engineer edits a work order form (changing inspection fields, adding notes), the changes are saved to Room immediately with status = WorkOrderStatus.PENDING_SYNC. The ViewModel saves on every field change (debounced by 500ms to avoid excessive writes):
kotlin
@HiltViewModel
class WorkOrderEditViewModel @Inject constructor(
    private val workOrderRepository: WorkOrderRepository,
    // A ViewModel has no Context — WorkManager is provided by a Hilt module
    // (e.g. @Provides fun workManager(@ApplicationContext c: Context) = WorkManager.getInstance(c))
    private val workManager: WorkManager,
    savedStateHandle: SavedStateHandle
) : ViewModel() {
    private val workOrderId = savedStateHandle.get<String>("workOrderId")!!

    fun onFieldChanged(fieldId: String, value: String) {
        viewModelScope.launch {
            workOrderRepository.updateField(workOrderId, fieldId, value)
            // Enqueue a WorkManager sync job (does nothing if already queued)
            workManager.enqueueUniqueWork(
                "sync_$workOrderId",
                ExistingWorkPolicy.KEEP, // Don't re-enqueue if a sync is already pending
                OneTimeWorkRequestBuilder<WorkOrderSyncWorker>()
                    .setInputData(workDataOf(KEY_WORK_ORDER_ID to workOrderId))
                    .setConstraints(
                        Constraints.Builder()
                            .setRequiredNetworkType(NetworkType.CONNECTED)
                            .build()
                    )
                    .build()
            )
        }
    }
}
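The 500ms debounce mentioned above is not visible in the snippet; one way to add it is to funnel field edits through a MutableSharedFlow inside the same ViewModel (a sketch — FieldEdit and the enqueueSync() helper are hypothetical names):

```kotlin
private data class FieldEdit(val fieldId: String, val value: String)

private val fieldEdits = MutableSharedFlow<FieldEdit>(extraBufferCapacity = 64)

init {
    viewModelScope.launch {
        fieldEdits
            .debounce(500) // Collapse rapid keystrokes into one Room write
            .collect { edit ->
                workOrderRepository.updateField(workOrderId, edit.fieldId, edit.value)
                enqueueSync() // The unique KEEP work request shown above
            }
    }
}

fun onFieldChanged(fieldId: String, value: String) {
    fieldEdits.tryEmit(FieldEdit(fieldId, value))
}
```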
WorkManager Sync Worker with Conflict Detection
kotlin
@HiltWorker // Required for Hilt to construct Workers via @AssistedInject
class WorkOrderSyncWorker @AssistedInject constructor(
    @Assisted context: Context,
    @Assisted workerParams: WorkerParameters,
    private val workOrderRepository: WorkOrderRepository
) : CoroutineWorker(context, workerParams) {
    override suspend fun doWork(): Result {
        val workOrderId = inputData.getString(KEY_WORK_ORDER_ID) ?: return Result.failure()
        val localOrder = workOrderRepository.getLocalWorkOrder(workOrderId)
            ?: return Result.failure()
        return try {
            val response = workOrderRepository.submitWorkOrder(
                workOrder = localOrder,
                etag = localOrder.serverEtag, // Used for conflict detection
                idempotencyKey = workOrderId // Prevents duplicate submission on retry
            )
            workOrderRepository.markAsSynced(workOrderId, newEtag = response.etag)
            Result.success()
        } catch (e: ConflictException) {
            // HTTP 412 — the server version changed while we were editing
            val serverVersion = workOrderRepository.fetchServerVersion(workOrderId)
            workOrderRepository.markAsConflicted(workOrderId, serverVersion)
            // Don't retry — show the conflict resolution UI to the engineer
            Result.failure(workDataOf(KEY_CONFLICT to true))
        } catch (e: IOException) {
            Result.retry() // Network error — WorkManager retries with exponential backoff
        }
    }
}
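The ConflictException the Worker catches must be raised by the repository when the server answers 412; a sketch of that mapping, assuming a Retrofit-style WorkOrderApi and a SubmitResponse carrying the new ETag (both illustrative):

```kotlin
class ConflictException : Exception("Server version changed (HTTP 412)")

data class SubmitResponse(val etag: String)

// Repository side — translates the HTTP status into domain exceptions
suspend fun submitWorkOrder(
    workOrder: WorkOrder,
    etag: String,
    idempotencyKey: String
): SubmitResponse {
    val response = api.submitWorkOrder(
        id = workOrder.id,
        ifMatch = etag,                  // Optimistic concurrency check
        idempotencyKey = idempotencyKey, // Lets the server deduplicate retries
        body = workOrder.toDto()
    )
    return when {
        response.isSuccessful -> response.body()!!
        response.code() == 412 -> throw ConflictException() // Precondition Failed
        else -> throw IOException("Submit failed: HTTP ${response.code()}")
    }
}

// Retrofit service — headers carry the concurrency and idempotency tokens
interface WorkOrderApi {
    @PUT("workorders/{id}")
    suspend fun submitWorkOrder(
        @Path("id") id: String,
        @Header("If-Match") ifMatch: String,
        @Header("Idempotency-Key") idempotencyKey: String,
        @Body body: WorkOrderDto
    ): Response<SubmitResponse>
}
```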
Conflict Resolution UI
When a work order is marked as CONFLICTED, the UI shows a conflict resolution screen presenting the engineer's local version and the server's current version side by side:
kotlin
@Composable
fun ConflictResolutionScreen(
    localVersion: WorkOrder,
    serverVersion: WorkOrder,
    onKeepLocal: () -> Unit, // Discard server changes, re-submit the local version
    onAcceptServer: () -> Unit, // Discard local changes, use the server version
    onMergeManually: () -> Unit // Open the form with both versions visible for a manual merge
) {
    // Show a diff of changed fields between localVersion and serverVersion
    // Each field that differs is highlighted with both values shown
}
Photo Attachments with WorkManager Chaining
Photos are stored in the app's internal files directory and their paths are stored in Room:
kotlin
fun attachPhoto(workOrderId: String, photoUri: Uri): Flow<AttachmentState> = flow {
    emit(AttachmentState.Saving)
    val localPath = photoStorage.copyToInternalStorage(photoUri)
    db.attachmentDao().insert(Attachment(workOrderId, localPath, status = PENDING_UPLOAD))
    emit(AttachmentState.Saved(localPath))
    // Chain: sync the work order first, then upload photos
    val syncWork = OneTimeWorkRequestBuilder<WorkOrderSyncWorker>()
        .setConstraints(networkConnectedConstraint).build()
    val uploadWork = OneTimeWorkRequestBuilder<PhotoUploadWorker>()
        .setInputData(workDataOf(KEY_WORK_ORDER_ID to workOrderId))
        .setConstraints(networkConnectedConstraint).build()
    WorkManager.getInstance(context)
        .beginUniqueWork("sync_$workOrderId", ExistingWorkPolicy.KEEP, syncWork)
        .then(uploadWork) // Photos upload only after the work order is synced
        .enqueue()
}
Early Warning Metrics:
- Pending sync queue size per engineer (daily) — the number of work orders in PENDING_SYNC status per user; above 5 pending items indicates the sync is not completing on connectivity restoration; investigate WorkManager execution logs on affected devices
- Conflict rate by work order type — segmented by work order type (emergency maintenance has higher conflict rate than scheduled inspection); a high conflict rate for a specific type signals a dispatch process problem on the server side
- Sync completion time from connectivity restoration to SYNCED status — measure with a Firebase Performance custom trace starting when ConnectivityManager.NetworkCallback.onAvailable() fires and ending when the last PENDING_SYNC work order transitions to SYNCED; target under 90 seconds for a typical 5-field work order; above 3 minutes triggers investigation of the Worker's backoff policy
4. Interview Score: 9.5 / 10
Why this demonstrates senior-level maturity: The refreshFromNetwork() function's check for localOrder?.status == WorkOrderStatus.PENDING_SYNC before overwriting with server data — preventing a server sync from discarding an engineer's unsaved offline work — is the specific race condition protection that field service apps routinely miss in their first implementation and discover during field trials. The ExistingWorkPolicy.KEEP for the sync WorkRequest (vs. REPLACE) ensures that if the engineer makes multiple rapid field changes, only one sync job is queued rather than replacing the queued job with a new one (which would reset the delay). The HTTP 412 Precondition Failed → ConflictException → Result.failure() path — deliberately not retrying conflicts — is the correct design because a conflict requires human resolution, not automatic retry.
What differentiates it from mid-level thinking: A mid-level Android developer would implement offline support by caching API responses in SharedPreferences or OkHttp's cache (which cannot handle offline mutations), would not design the PENDING_SYNC status for tracking local changes, would not know about HTTP If-Match / ETag for conflict detection, and would not chain the photo upload WorkRequest after the work order sync WorkRequest.
What would make it a 10/10: A 10/10 response would include the complete Room Entity for WorkOrder (showing the status, serverEtag, and locallyModifiedAt fields), a concrete WorkOrderDao with the @Transaction query that updates a field and sets status = PENDING_SYNC atomically, and a ConflictResolutionViewModel that computes the field-by-field diff between the local and server versions.
Question 13: Deep Links and App Links — Implementing Universal Link Handling in a Complex App
Difficulty: Senior | Role: Android Developer | Level: Senior | Company Examples: Google, Airbnb, Uber, Twitter, Spotify
The Question
You are a Senior Android Developer at a travel booking company. The app has a complex navigation structure built on the Jetpack Navigation component. The marketing team wants to implement deep linking so that: (1) clicking a link in an email takes users directly to a specific hotel detail page — even if the app is not installed (the link must open the Play Store if the app is missing); (2) a push notification for a booking confirmation navigates the user to their booking detail screen with the booking ID pre-populated; (3) clicking a restaurant recommendation from a third-party travel aggregator app should open the restaurant detail page in your app — without exposing your internal URL structure to the aggregator; (4) when the user opens the hotel detail page via a deep link while they are not logged in, they are taken to the login screen first and then redirected to the hotel detail page after login completes. Walk through the implementation of Android App Links, the Navigation component integration, the intent filter configuration, and the post-login deep link redemption pattern.
1. What Is This Question Testing?
- Android App Links vs. custom scheme deep links — knowing the difference: custom scheme deep links (booking://hotels/456) are handled exclusively by the app (if installed) but cannot be opened by a browser or email client on a device without the app; Android App Links (https://booking.company.com/hotels/456) are verified HTTPS links that open the app directly without the disambiguation dialog — the key requirement is the Digital Asset Links JSON file hosted at https://booking.company.com/.well-known/assetlinks.json that proves the link domain is owned by the app's signing certificate; knowing that App Links require the android:autoVerify="true" attribute on the intent filter and require the assetlinks.json to be correctly served before the OS will trust the links
- Navigation component deep link handling — knowing the two approaches: explicit deep link creation (NavDeepLinkBuilder) for constructing intents to specific destinations programmatically (used for push notifications), and implicit deep links (URI patterns defined in the navigation graph's <deepLink> elements that are automatically matched when the app is opened via a matching intent); knowing that implicit deep links defined in the nav graph generate the corresponding intent filters at build time when the manifest references the graph via a <nav-graph> element (the Safe Args Gradle plugin provides type-safe argument passing, not intent filter registration)
- Handling deep links in a Compose Navigation app — knowing that Compose Navigation's rememberNavController() and NavHost handle the deep link in the launching intent (intent.data and intent.extras) automatically; for deep links that arrive while the app is already running (delivered via onNewIntent()), navController.handleDeepLink(intent) must be called explicitly — for example from a LaunchedEffect keyed on the latest intent — to navigate to the deep linked destination
- Push notification deep links with NavDeepLinkBuilder — knowing that Firebase Cloud Messaging notifications are received in FirebaseMessagingService.onMessageReceived(); the push notification payload contains the deep link destination (booking ID, destination type); the PendingIntent for the notification tap is built with NavDeepLinkBuilder to create a back stack that includes the home screen, the bookings list, and the booking detail — so the back button works correctly after the user opens the notification
- The post-login deep link redemption pattern — knowing that the most common implementation mistake is that the deep link URI is lost when the app redirects to the login screen; the correct pattern: intercept the deep link before navigation, check the authentication state, if not authenticated — save the deep link URI to a PendingDeepLink holder (or SavedStateHandle) and navigate to login; after login succeeds, the login ViewModel reads the PendingDeepLink and navigates to the saved URI; this ensures the deep link is redeemed after authentication
- App-to-app deep links without exposing internal URLs — knowing that a custom intent action (optionally combined with a vendor-specific MIME type) is the inter-app communication pattern that does not require exposing URL structure; the third-party app sends an intent with action = "com.yourapp.OPEN_RESTAURANT" and the restaurant ID as an extra — your app's Activity registers an intent filter for this action; no URL is exposed, and the third-party app does not need to know your URL scheme
2. Framework: Deep Link Architecture Design Model (DLADM)
- Assumption Documentation — Map all deep link entry points: which screens can be entered via deep link (hotel detail, booking detail, restaurant detail, search results, user profile), what parameters each requires (hotel ID, booking ID, restaurant ID, search query), and which screens require authentication (booking detail requires login; hotel detail does not)
- Constraint Analysis — assetlinks.json must be served from the exact domain in the intent filter over HTTPS with Content-Type: application/json — any misconfiguration causes the App Links to fall back to the disambiguation dialog; set up monitoring for the assetlinks.json endpoint to alert if it becomes unavailable
- Tradeoff Evaluation — Implicit deep links (URI patterns in the nav graph) vs. explicit deep links (NavDeepLinkBuilder): implicit deep links are simpler and self-documenting but require the URI to match exactly; explicit deep links (built in code) are more flexible for complex navigation stacks like notification deep links that must build a complete back stack
- Hidden Cost Identification — App Links verification can take up to 20 seconds on first install (the OS must verify the assetlinks.json during or shortly after installation); during this window, App Links fall back to the disambiguation dialog; this is a known Android limitation that cannot be worked around programmatically — inform the marketing team that links clicked immediately after install may not work seamlessly
- Risk Signals / Early Warning Metrics — App Links verification success rate (use adb shell pm get-app-links <package> to verify the link verification status on test devices; a status of verified for each domain confirms the verification succeeded), deep link navigation success rate in Firebase Analytics (log a deep_link_success event when the target screen is reached and a deep_link_failed event when navigation fails; a high failure rate for a specific link pattern indicates a nav graph misconfiguration)
- Pivot Triggers — If the assetlinks.json verification fails for a significant number of users (App Links fall back to disambiguation dialogs): check whether the server is serving the correct Content-Type: application/json header and whether the SHA-256 certificate fingerprint in assetlinks.json matches the release signing certificate (not the debug certificate, which is different)
- Long-Term Evolution Plan — Centralise deep link routing in a DeepLinkRouter class that parses all incoming URIs and routes them to the correct navigation destination; this makes it easy to add new deep link destinations, log deep link analytics, and handle authentication guards without duplicating logic across multiple entry points
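A centralised router of this kind can be sketched framework-free. This is a minimal illustration, not the app's actual class: the destination names and path scheme are assumptions, and a production version would return Navigation routes and consult the auth manager rather than a boolean:

```kotlin
import java.net.URI

// Destinations the router can resolve to; the real app would map these
// onto Navigation component routes.
sealed class Destination {
    data class HotelDetail(val hotelId: String) : Destination()
    data class BookingDetail(val bookingId: String) : Destination()
    object Unknown : Destination()
}

// One place that parses every incoming link, so auth guards and analytics
// can be added without duplicating parsing logic at each entry point.
class DeepLinkRouter {
    fun route(uri: String): Destination {
        val segments = URI(uri).path.trim('/').split('/')
        return when {
            segments.size == 2 && segments[0] == "hotels" -> Destination.HotelDetail(segments[1])
            segments.size == 2 && segments[0] == "bookings" -> Destination.BookingDetail(segments[1])
            else -> Destination.Unknown
        }
    }

    // Per the requirements: booking details need login; hotel details do not.
    fun requiresAuth(destination: Destination): Boolean =
        destination is Destination.BookingDetail
}
```

Because parsing and the auth decision live in one class, adding a new deep link destination is a single `when` branch plus, if needed, one line in `requiresAuth`.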
3. The Answer
Requirement 1: App Links — Hotel Detail from Email
Configure the intent filter in AndroidManifest.xml with android:autoVerify="true":
xml
<activity android:name=".MainActivity" android:exported="true">
    <intent-filter android:autoVerify="true">
        <action android:name="android.intent.action.VIEW" />
        <category android:name="android.intent.category.DEFAULT" />
        <category android:name="android.intent.category.BROWSABLE" />
        <data
            android:scheme="https"
            android:host="booking.company.com"
            android:pathPrefix="/hotels/" />
    </intent-filter>
</activity>
Create assetlinks.json at https://booking.company.com/.well-known/assetlinks.json, with sha256_cert_fingerprints set to the SHA-256 fingerprint of the release signing certificate (the file must be valid JSON, so it cannot contain comments):
json
[{
  "relation": ["delegate_permission/common.handle_all_urls"],
  "target": {
    "namespace": "android_app",
    "package_name": "com.company.booking",
    "sha256_cert_fingerprints": ["AA:BB:CC:..."]
  }
}]
In the Compose Navigation graph, define the deep link for the hotel detail destination:
kotlin
composable(
    route = "hotel/{hotelId}",
    deepLinks = listOf(navDeepLink {
        uriPattern = "https://booking.company.com/hotels/{hotelId}"
    })
) { backStackEntry ->
    val hotelId = requireNotNull(backStackEntry.arguments?.getString("hotelId"))
    HotelDetailScreen(hotelId = hotelId)
}
Requirement 2: Push Notification Deep Link with Back Stack
In FirebaseMessagingService.onMessageReceived(), build the notification PendingIntent with a complete back stack:
kotlin
override fun onMessageReceived(message: RemoteMessage) {
    val bookingId = message.data["bookingId"] ?: return
    val pendingIntent = NavDeepLinkBuilder(applicationContext)
        .setGraph(R.navigation.nav_graph)
        .setDestination(R.id.bookingDetailFragment)
        .setArguments(bundleOf("bookingId" to bookingId))
        .createPendingIntent() // Builds back stack: Home → Bookings List → Booking Detail
    val notification = NotificationCompat.Builder(this, CHANNEL_BOOKINGS)
        .setContentTitle("Booking Confirmed!")
        .setContentText("Your booking #$bookingId is confirmed")
        .setContentIntent(pendingIntent)
        .setAutoCancel(true)
        .build()
    NotificationManagerCompat.from(this).notify(bookingId.hashCode(), notification)
}
The NavDeepLinkBuilder automatically creates the correct back stack so the user can press back from the booking detail screen and land on the bookings list, not exit the app.
Requirement 3: App-to-App Deep Link Without Exposing URL Structure
Register a custom intent action in AndroidManifest.xml:
xml
<activity android:name=".MainActivity" android:exported="true">
    <intent-filter>
        <action android:name="com.company.booking.OPEN_RESTAURANT" />
        <category android:name="android.intent.category.DEFAULT" />
        <data android:mimeType="vnd.company.booking/restaurant" />
    </intent-filter>
</activity>
The third-party aggregator app sends:
kotlin
val intent = Intent("com.company.booking.OPEN_RESTAURANT").apply {
    type = "vnd.company.booking/restaurant"
    putExtra("restaurantId", "rest_789")
}
startActivity(intent)
Your MainActivity receives this intent and navigates to the restaurant detail screen. The aggregator knows only the action string and MIME type — not your URL structure, routing logic, or other screens. In MainActivity.onCreate():
kotlin
if (intent.action == "com.company.booking.OPEN_RESTAURANT") {
    val restaurantId = intent.getStringExtra("restaurantId")
    navController.navigate(RestaurantDetailRoute(restaurantId))
}
Requirement 4: Post-Login Deep Link Redemption
The PendingDeepLink pattern stores the intended destination before redirecting to login:
kotlin
// In the deep link interceptor (called before navigating to any destination)
fun handleIncomingDeepLink(uri: Uri, navController: NavController) {
    val requiresAuth = deepLinkRequiresAuthentication(uri)
    if (requiresAuth && !authManager.isLoggedIn) {
        // Save the intended destination before redirecting
        pendingDeepLinkStore.save(uri)
        navController.navigate(LoginRoute)
    } else {
        navController.navigate(uri)
    }
}

// In LoginViewModel — called after successful login
fun onLoginSuccess() {
    viewModelScope.launch { // emit() is suspending, so launch in the ViewModel's scope
        val pendingDeepLink = pendingDeepLinkStore.consume() // Reads and clears the stored URI
        if (pendingDeepLink != null) {
            _navigationEvent.emit(NavigationEvent.DeepLink(pendingDeepLink))
        } else {
            _navigationEvent.emit(NavigationEvent.Home)
        }
    }
}
The pendingDeepLinkStore persists the URI to EncryptedSharedPreferences (not in-memory — the process may be killed during login on low-memory devices) and clears it after consumption (consume() reads and deletes atomically). The navController.navigate(pendingDeepLink) in the UI after login uses the full URI (matching the implicit deep link pattern in the nav graph) to navigate to the correct screen with the correct arguments.
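The consume-exactly-once contract of that store can be sketched in isolation. In this sketch the backing storage is deliberately abstracted to an in-process AtomicReference purely to show the semantics; the production store, as described above, persists to EncryptedSharedPreferences so the URI survives process death, and holds the URI as a string rather than a Uri:

```kotlin
import java.util.concurrent.atomic.AtomicReference

// Sketch of the pending-deep-link contract: save once, consume at most once.
// A production implementation backs this with EncryptedSharedPreferences.
class PendingDeepLinkStore {
    private val pending = AtomicReference<String?>(null)

    fun save(uri: String) = pending.set(uri)

    // Atomically read and clear, so a duplicated navigation event can never
    // redeem the same deep link twice.
    fun consume(): String? = pending.getAndSet(null)
}
```

The atomic read-and-clear is the important part: if login success fires twice (configuration change, double event), the second call sees null and falls through to the home destination instead of re-navigating.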
Early Warning Metrics:
- App Links verification status per device in Firebase Analytics — on API 31+, query DomainVerificationManager for the app's domain verification state at startup in debug builds and log it as a custom event (on a workstation, adb shell pm get-app-links <package> surfaces the same information); a not_verified status indicates the assetlinks.json is not being correctly served on the production domain
- Deep link to screen conversion rate — the percentage of incoming deep links that result in the target screen being displayed (vs. the user landing on the home screen due to a routing error or authentication redirect without redemption); target above 90%; below 80% indicates the post-login redemption pattern is failing for some deep link types
- Pending deep link stale rate — the percentage of stored pendingDeepLink URIs that are never consumed (the user abandoned the login flow); high stale rates indicate the UX of the login-to-deep-link redemption is not intuitive; consider surfacing a "you were trying to view X" message on the login screen to remind the user of their intent
4. Interview Score: 9.5 / 10
Why this demonstrates senior-level maturity: The pendingDeepLinkStore.consume() pattern — specifically using EncryptedSharedPreferences (not in-memory state) to persist the pending deep link URI because the process may be killed during the login flow on low-memory devices — is the edge case that causes "I completed login and the deep link was lost" bugs that are reported by users but difficult to reproduce in development. The app-to-app intent with a custom MIME type (vnd.company.booking/restaurant) — rather than a URL scheme — as the clean encapsulation boundary is the specific Android inter-app communication pattern that does not expose internal routing. The NavDeepLinkBuilder producing a complete back stack (Home → Bookings List → Booking Detail) so the back button works correctly after a notification tap is the user experience detail that separates production-quality notification deep links from the naive "navigate to destination directly, breaking the back stack" implementation.
What differentiates it from mid-level thinking: A mid-level Android developer would implement custom scheme deep links (not App Links) without knowing about the assetlinks.json requirement, would not know about NavDeepLinkBuilder for building notification back stacks, would not design the pending deep link store for post-login redemption (losing the deep link on login redirect), and would use URL-based inter-app communication (exposing the URL structure to the aggregator).
What would make it a 10/10: A 10/10 response would include the complete assetlinks.json generation script (showing how to extract the SHA-256 certificate fingerprint from the release keystore using keytool), a DeepLinkRouter class implementation that centralises all URI parsing and routing decisions, and a DeepLinkIntegrationTest using ActivityScenario and Intent to verify that each deep link URI pattern navigates to the correct destination.
Question 14: Memory Management — Detecting and Fixing Memory Leaks in Android
Difficulty: Senior | Role: Android Developer | Level: Senior | Company Examples: Google, Square (LeakCanary creators), Instagram, Airbnb, Dropbox
The Question
You are a Senior Android Developer at a social media company. The app has been receiving crash reports with OutOfMemoryError affecting 2.3% of daily active users — particularly on devices with 3GB or less RAM. Firebase Crashlytics shows the crashes are not clustered in one feature area; they occur across multiple screens after the app has been running for 20–40 minutes. LeakCanary has been enabled in the debug build and has detected 4 distinct leak patterns in the last sprint. Analyse each of the following 4 detected leaks and provide the specific fix: (1) A Fragment is retained via a List<WeakReference<Fragment>> that the app maintains for a "fragment history" feature — the dead entries are never pruned, and code that reads the history dereferences them into strong Fragment references, keeping the Fragment (and through it the View and the Activity) reachable; (2) A Singleton EventBus holds subscribers as strong references — a Fragment that subscribes in onCreateView but unsubscribes only in onDestroy leaks while it sits on the back stack between those two lifecycle events; (3) An ImageView's OnClickListener captures a reference to the Activity via a lambda that is stored in a static field; (4) A CoroutineScope created with CoroutineScope(Dispatchers.Main) inside a Fragment (not viewLifecycleOwner.lifecycleScope) continues running after the Fragment is destroyed, holding a reference to the Fragment.
1. What Is This Question Testing?
- Memory leak patterns in Android — understanding the 4 common Android memory leak categories: static references to Contexts or Views (the most common — a static field that holds an Activity or Fragment reference prevents GC even after the screen is finished), inner class references (an anonymous inner class OnClickListener implicitly holds a reference to its outer class — the Activity), lifecycle mismatch (subscribing in an earlier lifecycle callback but unsubscribing in a later one creates a window where the subscription leaks the subscriber), and leaked coroutine scopes (a coroutine scope that is not cancelled when a component is destroyed holds the component in memory via the coroutine's Job and any captured references)
- WeakReference vs. SoftReference vs. strong reference — knowing that WeakReference<T> allows the GC to collect the referenced object when there are no strong references — but a WeakReference stored in a List does not automatically remove itself from the list when its referent is collected; the List<WeakReference<Fragment>> still contains the WeakReference object even after the Fragment is GC'd; the list must be periodically pruned of dead WeakReferences using weakRef.get() == null as the dead-reference check
- The Fragment View lifecycle — understanding that Fragment.getView() is not the same as the Fragment itself; a Fragment can be in the back stack (retained in memory) while its View is destroyed; listeners and observers attached to the View must be cleaned up in onDestroyView() (not onDestroy()), because the View lifecycle is shorter than the Fragment lifecycle; the correct pattern for observing LiveData and StateFlow in a Fragment is viewLifecycleOwner.lifecycleScope (not lifecycleScope), which is scoped to the View lifecycle
- Static field leaks — knowing that any static field (in Java) or companion object property (in Kotlin) holds a reference for the lifetime of the class loader (effectively the entire application lifetime); storing an Activity, Fragment, View, or any object that holds a reference to these in a static field is always a leak unless the static field is explicitly cleared when the component is destroyed; WeakReference in static fields is the correct pattern for caches that should not prevent GC
- LeakCanary's leak trace interpretation — knowing how to read a LeakCanary leak trace: the trace shows the reference chain from GC root (a static field, a thread, a JVM thread-local) through the object graph to the leaked object; the "suspect" reference (the one that should have been cleared) is highlighted; knowing that the fix is always to break the reference chain at the highlighted suspect
- Coroutine scope and Fragment lifecycle — knowing the distinction between lifecycleScope (scoped to the Fragment's lifecycle — cancelled in onDestroy()) and viewLifecycleOwner.lifecycleScope (scoped to the Fragment's view lifecycle — cancelled in onDestroyView()); for operations that interact with the View (UI updates, collecting UI state), viewLifecycleOwner.lifecycleScope is always correct because the View does not exist outside the View lifecycle
2. Framework: Memory Leak Detection and Remediation Model (MLDRM)
- Assumption Documentation — Before fixing any leak, confirm the leak using LeakCanary's heap dump analysis: identify the GC root (the object that prevents GC — often a static field or a running thread), the leaked object (the Fragment, Activity, or View), and the reference chain that connects them; the fix is always at the reference chain, not at the leaked object
- Constraint Analysis — LeakCanary is only enabled in debug builds; production leak monitoring must use alternative signals: OutOfMemoryError crash rate in Firebase Crashlytics, memory usage trend in Firebase Performance Monitoring (the P95 heap usage should be flat over a session; an upward trend indicates a leak)
- Tradeoff Evaluation — Fix the leak by breaking the reference at the earliest possible point (cleaning up in onDestroyView() rather than onDestroy() for View-related resources) vs. using WeakReference to prevent the leak (allows the reference to exist but prevents it from blocking GC); the preferred approach is breaking the reference explicitly — WeakReference is a fallback for cases where explicit cleanup is not possible
- Hidden Cost Identification — The OutOfMemoryError affects only users on low-RAM devices (3GB and below), but the leaked objects exist on all devices — the leak just does not cause a crash on high-RAM devices because they have enough headroom; fix the leak for all users, not just the low-RAM affected users
- Risk Signals / Early Warning Metrics — LeakCanary leak count per build in CI (integrate LeakCanary's AppWatcher.objectWatcher with a CI test that simulates the leak-triggering user flows and asserts that no new leaks are detected), heap usage P95 trend in Firebase Performance (alert if P95 heap usage increases by more than 20MB between two consecutive releases), OutOfMemoryError rate by device RAM tier (segment crash data by device RAM — a crash rate that is 10× higher on 2GB devices than on 6GB devices confirms a leak, not a general OOM issue)
- Pivot Triggers — If after fixing all 4 known leaks, the OutOfMemoryError rate does not decrease by more than 50%: there are additional leaks not yet detected by LeakCanary; enable LeakCanary's heap dump in the beta build for a small percentage of beta users (not all users — heap dumps are 10–50MB and cause a visible freeze) to collect additional leak traces from the production user flows that the debug testing did not cover
- Long-Term Evolution Plan — Sprint 1: fix all 4 detected leaks; Sprint 2: add LeakCanary UI test integration (run LeakCanary's instrumentation assertion, LeakAssertions.assertNoLeaks() from the leakcanary-android-instrumentation artifact, after every UI test that navigates away from a screen); Sprint 3: add a custom Kotlin lint rule that flags companion object properties that hold Context references; Sprint 4: memory budget documentation (document the expected heap usage for each major screen)
3. The Answer
Leak 1: List<WeakReference<Fragment>> Not Being Pruned
The problem: the "fragment history" list is private val fragmentHistory = mutableListOf<WeakReference<Fragment>>(). When a Fragment is destroyed, its WeakReference in the list becomes dead (.get() returns null), but the WeakReference wrapper object itself remains in the list forever — the list grows without bound across the session, and the list typically lives in a singleton or application-scoped object, so nothing ever clears it. Worse, any code in the history feature that calls .get() and stores the result in a field or a captured lambda converts the weak reference back into a strong one, pinning the Fragment — and through it the View hierarchy and the Activity — in memory. The fix: prune the list of dead references whenever a new Fragment is added:
kotlin
class FragmentHistoryManager {
    private val fragmentHistory = mutableListOf<WeakReference<Fragment>>()

    fun onFragmentCreated(fragment: Fragment) {
        // Prune dead references before adding the new one
        fragmentHistory.removeAll { it.get() == null }
        fragmentHistory.add(WeakReference(fragment))
    }

    fun getActiveFragments(): List<Fragment> =
        fragmentHistory.mapNotNull { it.get() } // Only returns non-null (living) fragments
}
Additionally: register a FragmentManager.FragmentLifecycleCallbacks that calls fragmentHistory.removeAll { it.get() == fragment || it.get() == null } in onFragmentDestroyed() — proactively removing the dead reference as soon as the Fragment is destroyed rather than waiting for the next onFragmentCreated().
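The prune-on-add plus prune-on-destroy idea can be sketched framework-free. This is a generic illustration, not the app's actual class: in the real app the type parameter is Fragment and onDestroyed() is wired to FragmentManager.FragmentLifecycleCallbacks.onFragmentDestroyed():

```kotlin
import java.lang.ref.WeakReference

// Generic version of the history manager: prune on add AND on destroy.
// In the app, T = Fragment and onDestroyed() is called from
// FragmentManager.FragmentLifecycleCallbacks.onFragmentDestroyed().
class HistoryManager<T : Any> {
    private val history = mutableListOf<WeakReference<T>>()

    fun onCreated(item: T) {
        history.removeAll { it.get() == null } // drop dead wrappers first
        history.add(WeakReference(item))
    }

    // Proactive cleanup: remove the entry as soon as the component is
    // destroyed, instead of waiting for the next onCreated() call.
    fun onDestroyed(item: T) {
        history.removeAll { it.get() == null || it.get() === item }
    }

    fun active(): List<T> = history.mapNotNull { it.get() }
}
```

With both hooks in place the list's size is bounded by the number of live components, regardless of how long the session runs.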
Leak 2: EventBus Subscriber Retained Between Lifecycle Events
The problem: the Fragment subscribes to EventBus in onCreateView() but unsubscribes only in onDestroy(). The View lifecycle (onCreateView() → onViewCreated() → onDestroyView()) is shorter than the Fragment lifecycle, so while the Fragment sits in the back stack (View destroyed, onDestroy() not yet called), the EventBus still retains the Fragment as a subscriber — preventing GC of the Fragment and of any View references still held by its fields or captured lambdas. The fix: pair subscribe and unsubscribe with matching lifecycle events:
kotlin
override fun onViewCreated(view: View, savedInstanceState: Bundle?) {
    super.onViewCreated(view, savedInstanceState)
    // Subscribe in onViewCreated, unsubscribe in onDestroyView — a matched pair
    EventBus.getDefault().register(this)
}

override fun onDestroyView() {
    EventBus.getDefault().unregister(this)
    super.onDestroyView()
}
The deeper fix: replace the EventBus with a SharedFlow from the ViewModel, collected using viewLifecycleOwner.lifecycleScope:
kotlin
// In Fragment.onViewCreated()
viewLifecycleOwner.lifecycleScope.launch {
    viewLifecycleOwner.repeatOnLifecycle(Lifecycle.State.STARTED) {
        viewModel.events.collect { event -> handleEvent(event) }
    }
}
// No manual cleanup required — the viewLifecycleOwner's scope cancels automatically in onDestroyView
Leak 3: Static Field Holding Activity via Lambda
The LeakCanary trace: static CompanionObject.clickListener → ActivityMainBinding → MainActivity. The problem: a companion object holds a lambda that was created inside an Activity (capturing this) and the lambda was assigned to the static field:
kotlin
// BROKEN — the lambda captures the Activity
companion object {
    var clickListener: View.OnClickListener? = null // Static field
}

override fun onCreate(savedInstanceState: Bundle?) {
    clickListener = View.OnClickListener { /* uses 'this' Activity */ } // Leaked!
}
The fix: never store Activity or View references in static fields or companion object properties. If the listener must be accessible statically, use an interface with a WeakReference wrapper:
kotlin
// If a static reference is truly required
companion object {
    private var weakClickListener: WeakReference<View.OnClickListener>? = null

    fun setClickListener(listener: View.OnClickListener) {
        weakClickListener = WeakReference(listener)
    }

    fun clearClickListener() {
        weakClickListener = null // Called in Activity.onDestroy()
    }
}
The simplest fix: remove the static field entirely and use proper component architecture — the click handler belongs in the ViewModel or as a local variable in onCreate(), not in a companion object.
Leak 4: CoroutineScope(Dispatchers.Main) in Fragment
The problem: private val scope = CoroutineScope(Dispatchers.Main) in a Fragment creates a CoroutineScope that is not tied to any Android lifecycle. When the Fragment is destroyed, scope is not cancelled — any running coroutines in scope hold a reference to the Fragment (via captured references in the coroutine lambdas). The fix: use viewLifecycleOwner.lifecycleScope instead of creating a custom scope:
kotlin
// BROKEN — scope outlives the Fragment
class ProfileFragment : Fragment() {
    private val scope = CoroutineScope(Dispatchers.Main) // Not cancelled on destroy!

    override fun onViewCreated(view: View, savedInstanceState: Bundle?) {
        scope.launch { /* captures Fragment references */ }
    }
}

// CORRECT — scope is automatically cancelled in onDestroyView
class ProfileFragment : Fragment() {
    override fun onViewCreated(view: View, savedInstanceState: Bundle?) {
        viewLifecycleOwner.lifecycleScope.launch { /* safe */ }
    }
}
If a custom scope is genuinely required (a case where neither lifecycleScope nor viewLifecycleOwner.lifecycleScope are appropriate), add explicit cancellation:
kotlin
private val scope = CoroutineScope(Dispatchers.Main + Job())

override fun onDestroyView() {
    scope.cancel() // Explicit cancellation
    super.onDestroyView()
}
The FragmentLiveDataObserve lint check (shipped with the AndroidX Fragment lint rules) catches the analogous LiveData mistake — observing with the Fragment as the LifecycleOwner inside onCreateView() instead of viewLifecycleOwner — and the UnsafeRepeatOnLifecycleDetector check flags repeatOnLifecycle calls in a Fragment that use the Fragment's own lifecycle where the View lifecycle is intended; enable both in the project's lint.xml to catch lifecycle-owner confusion in CI.
Early Warning Metrics:
- LeakCanary retained object count after navigation in UI tests — instrument the Espresso test suite to call AppWatcher.objectWatcher.expectWeaklyReachable(fragment, "Fragment was retained after navigation") after navigating away from each screen; any object that is not GC'd within 5 seconds triggers a heap dump; a zero-leak UI test suite is the target
- Heap usage P95 monotonic growth per session (Firebase Performance) — the heap usage at session start vs. after 10 minutes of typical use; a growth of more than 30MB over 10 minutes without a corresponding increase in user-visible content indicates an active leak; alert threshold: 50MB growth in 15 minutes
- OutOfMemoryError rate per device RAM tier (weekly) — segmented by 2GB, 3GB, 4GB+; after fixing all 4 leaks, the rate for 2GB devices should decrease by at least 60%; if not, additional leaks remain
4. Interview Score: 9.5 / 10
Why this demonstrates senior-level maturity: The List<WeakReference<Fragment>> diagnosis — specifically identifying that the WeakReference object itself stays in the list even after the referent is GC'd, and that the fix is both pruning on add AND a FragmentLifecycleCallbacks for proactive cleanup on destroy — shows that this engineer understands how WeakReference actually works in a Java GC context, not just that it "prevents memory leaks." The Leak 2 fix's pivot to replacing EventBus with viewLifecycleOwner.lifecycleScope + SharedFlow (eliminating the entire class of lifecycle mismatch leaks for that component, not just fixing the current instance) demonstrates the "fix the pattern, not just the instance" architectural thinking. The LeakCanary UI test integration — AppWatcher.objectWatcher.expectWeaklyReachable(fragment, ...) after navigation — is the production-grade CI integration that enforces zero new leaks in every PR rather than discovering leaks in debug testing months later.
What differentiates it from mid-level thinking: A mid-level Android developer would know about WeakReference as a general leak prevention tool but would not know that WeakReferences in a List must be pruned, would fix Leak 2 by unsubscribing in onDestroy() instead of onDestroyView() (for a Fragment on the back stack, the view is destroyed but onDestroy() does not run, so the view-layer subscription leaks for as long as the Fragment stays on the back stack), would know static fields cause leaks but not know about the WeakReference wrapper pattern, and would not know the difference between lifecycleScope and viewLifecycleOwner.lifecycleScope.
What would make it a 10/10: A 10/10 response would include a complete FragmentLifecycleCallbacks implementation for the fragment history pruning, a custom Kotlin Lint rule detecting CoroutineScope(Dispatchers.Main) inside a Fragment (where viewLifecycleOwner.lifecycleScope should be used instead), and a Firebase Performance trace configuration for the heap usage monitoring that automatically segments by device RAM tier.
Question 15: Firebase and Push Notifications — Advanced FCM Integration and Notification Channels
Difficulty: Senior | Role: Android Developer | Level: Senior | Company Examples: Google, WhatsApp, Slack, Twitter, Duolingo
The Question
You are a Senior Android Developer at a messaging app company. The app uses Firebase Cloud Messaging for push notifications, but user complaints and app store reviews reveal 4 problems: (1) Notification sounds and vibration cannot be customised by users because the app uses a single notification channel for all notifications (messages, reactions, system alerts) — Android 8+ allows per-channel customisation in system settings but only if distinct channels are created; (2) On Android 13+ devices, the notification permission dialog is never shown — 60% of new users on Android 13 devices have notifications disabled; (3) When the app receives a data-only FCM message while in the background on Android 8+ devices, the notification is not displayed — the FCM handler calls startService() which fails with an IllegalStateException: App in background on Android 8+; (4) A promotional notification campaign sends the same notification to a user 3 times in 5 minutes because the FCM server occasionally sends duplicate messages — the app does not deduplicate. Walk through the correct implementation for each problem.
1. What Is This Question Testing?
- Notification channels on Android 8+ — knowing that NotificationChannel was introduced in Android 8.0 (API 26) as a mandatory grouping mechanism for all notifications; a channel must be created before any notification can be shown; the channel ID, name, and importance level (IMPORTANCE_HIGH for chat messages — plays a sound and shows a heads-up notification, IMPORTANCE_DEFAULT for reactions) determine the notification's default behaviour; users can customise each channel independently in system settings; channels cannot be deleted and recreated to reset user customisations — the channel's user settings are permanent once the user has changed them
- Android 13 notification permission — knowing that Android 13 (API 33) introduced a mandatory runtime permission POST_NOTIFICATIONS that must be requested with ActivityCompat.requestPermissions() (like camera or location permission); apps that target API 33 must explicitly request this permission; the permission dialog is not shown automatically — the app must call requestPermissions() at an appropriate moment in the user journey (not on cold start — Android's best practice is to request notification permission in context, e.g., when the user receives their first message or opts in to alerts); knowing that the FirebaseMessaging.getInstance().token can be obtained without the notification permission — the permission is only required to display notifications
- Background service restriction on Android 8+ — knowing that Android 8.0 introduced Context.startForegroundService() for starting services from the background, and that calling startService() from a backgrounded app throws IllegalStateException; for processing FCM data messages received in the background, the correct approach is either: (a) show the notification directly from FirebaseMessagingService.onMessageReceived() (for simple notifications), or (b) enqueue a WorkRequest (for complex processing that cannot complete within the FCM delivery timeout); FirebaseMessagingService itself runs in the foreground — any work done directly in onMessageReceived() runs in the foreground context and is not subject to the background service restriction
- FCM message deduplication — knowing that FCM's QoS (Quality of Service) can occasionally deliver the same message more than once (at-least-once delivery semantics); the RemoteMessage.messageId is a unique identifier per message — storing the last N message IDs in a Set<String> (backed by SharedPreferences or Room) allows checking whether a message has already been processed; a message with an ID already in the set is a duplicate and should be discarded
- FirebaseMessaging.getInstance().subscribeToTopic() — knowing the difference between device tokens (direct messages to a specific device — used for personal notifications) and topic subscriptions (broadcast to all devices subscribed to a topic — used for promotional campaigns); knowing that a promotional notification sent to a topic that multiple server-side systems also send to individually can result in multiple deliveries of the same conceptual notification even with different message IDs — the deduplication must be at the content level (title + body + timestamp within a window) as well as the message ID level
- In-app notification handling — knowing that FirebaseMessagingService.onMessageReceived() is called only when the app is in the foreground for notification messages (data-only messages invoke onMessageReceived() regardless of foreground/background); when the app is in the foreground, the system does not display the notification automatically — the app must handle it explicitly (show an in-app banner, update the badge count, or display a Snackbar)
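The foreground-handling rule in the last bullet can be captured as a tiny routing decision. This is a hypothetical sketch — the enum and function names are invented here, and `viewingThatConversation` is an assumed input (in a real app, foreground state would come from ProcessLifecycleOwner):

```kotlin
// Hypothetical sketch: where to render an incoming message, given that the
// system does not auto-display notifications while the app is in the foreground.
enum class DisplaySurface { SYSTEM_NOTIFICATION, IN_APP_BANNER, NONE }

fun chooseDisplaySurface(appInForeground: Boolean, viewingThatConversation: Boolean): DisplaySurface =
    when {
        !appInForeground -> DisplaySurface.SYSTEM_NOTIFICATION // system shade handles it
        viewingThatConversation -> DisplaySurface.NONE         // the open chat already shows it
        else -> DisplaySurface.IN_APP_BANNER                   // app must render its own UI
    }
```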
2. Framework: FCM and Notification Architecture Model (FNAM)
- Assumption Documentation — Classify all notification types the app sends: personal messages (high importance, user-to-user), group messages (high importance), reactions (default importance), system alerts (default importance), and promotional (low importance); each type needs its own notification channel with the appropriate importance level
- Constraint Analysis — Notification channels cannot be deleted and recreated to change their settings after the user has customised them; plan the channel structure carefully before the first production release; a poorly planned channel structure that requires a "reset" breaks user settings
- Tradeoff Evaluation — Show the notification permission dialog immediately on first launch (maximises the percentage of users who grant permission, but the context is missing — the user doesn't know what the notifications will contain yet) vs. show it when the user first receives a message (contextual, higher grant rate for the users who see it, but delays permission request and some users may never receive a message before closing the app); Android's official recommendation is contextual permission requests after the user understands the value
- Hidden Cost Identification — The notification deduplication store in SharedPreferences must be bounded in size; if a user receives 1,000 notifications per day, an unbounded deduplication store grows indefinitely; use a sliding window: keep only the message IDs from the last 10 minutes (the window in which duplicate delivery typically occurs) — prune older entries on every new message received
- Risk Signals / Early Warning Metrics — Notification permission grant rate by Android API level (segmented in Firebase Analytics: notif_permission_granted vs. notif_permission_denied by Build.VERSION.SDK_INT; target above 60% grant rate for API 33+ devices; below 40% indicates the permission request timing or context is suboptimal), duplicate notification rate (log notif_duplicate_discarded events and alert if the rate exceeds 0.5% of all FCM messages received in a day — above 0.5% indicates a server-side duplicate sending problem that must be fixed at the FCM sending layer, not just the client deduplication layer)
- Pivot Triggers — If the notification permission grant rate is below 30% for API 33+ devices despite contextual requesting: A/B test the timing and framing of the permission request; specifically, showing a pre-permission rationale screen ("Enable notifications to receive messages in real time — you can customise which notifications you receive") before calling requestPermissions() improves grant rates by 15–25% per Google's Material Design permission guidance
- Long-Term Evolution Plan — Implement notification grouping (group multiple chat messages from the same conversation under a single notification that expands to show all messages), notification reply from the notification shade (using RemoteInput to allow users to reply to messages without opening the app), and notification bubbles (the persistent floating chat head UI using the Conversation notification style + BubbleMetadata)
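The sliding-window deduplication store from the Hidden Cost bullet can be sketched as a small, testable class. This is a hypothetical in-memory illustration — `MessageIdDeduplicator` is a name invented here, and the map stands in for the SharedPreferences (or Room) backing store; the clock is injected so the window is unit-testable:

```kotlin
// Hypothetical sketch: sliding-window message-ID deduplication. Entries older
// than the window are pruned on every message received, bounding the store.
class MessageIdDeduplicator(
    private val windowMs: Long = 10 * 60 * 1000L,          // 10-minute window
    private val now: () -> Long = System::currentTimeMillis // injectable clock
) {
    private val seen = LinkedHashMap<String, Long>() // messageId -> receivedAt

    /** Returns true if the message is new (and records it); false if it is a duplicate. */
    fun tryProcess(messageId: String): Boolean {
        val ts = now()
        seen.entries.removeAll { ts - it.value > windowMs } // prune old entries
        if (seen.containsKey(messageId)) return false       // duplicate — discard
        seen[messageId] = ts
        return true
    }

    fun size(): Int = seen.size
}
```

The same prune-on-every-receive shape appears in the SharedPreferences-backed implementation in Fix 4 below.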
3. The Answer
Fix 1: Multiple Notification Channels
Create a channel per notification category in Application.onCreate():
kotlin
class AppApplication : Application() {

    override fun onCreate() {
        super.onCreate()
        createNotificationChannels()
    }

    private fun createNotificationChannels() {
        if (Build.VERSION.SDK_INT < Build.VERSION_CODES.O) return
        val channels = listOf(
            NotificationChannel(
                CHANNEL_DIRECT_MESSAGES,
                getString(R.string.channel_direct_messages_name),
                NotificationManager.IMPORTANCE_HIGH
            ).apply {
                description = getString(R.string.channel_direct_messages_description)
                enableVibration(true)
                setSound(RingtoneManager.getDefaultUri(RingtoneManager.TYPE_NOTIFICATION), null)
            },
            NotificationChannel(
                CHANNEL_REACTIONS,
                getString(R.string.channel_reactions_name),
                NotificationManager.IMPORTANCE_DEFAULT
            ).apply {
                description = getString(R.string.channel_reactions_description)
                enableVibration(false) // Reactions don't need vibration
                setSound(null, null)   // Reactions are silent by default
            },
            NotificationChannel(
                CHANNEL_SYSTEM,
                getString(R.string.channel_system_name),
                NotificationManager.IMPORTANCE_LOW
            ),
            NotificationChannel(
                CHANNEL_PROMOTIONS,
                getString(R.string.channel_promotions_name),
                NotificationManager.IMPORTANCE_MIN // Minimal — no sound, no heads-up
            )
        )
        val notificationManager = getSystemService(NotificationManager::class.java)
        channels.forEach { notificationManager.createNotificationChannel(it) }
    }

    companion object {
        const val CHANNEL_DIRECT_MESSAGES = "direct_messages_v1"
        const val CHANNEL_REACTIONS = "reactions_v1"
        const val CHANNEL_SYSTEM = "system_v1"
        const val CHANNEL_PROMOTIONS = "promotions_v1"
    }
}
The _v1 suffix convention allows creating a new channel in future (_v2) if the settings need to be changed — the old channel becomes obsolete (users who changed it keep their settings; new users get the new defaults on _v2). Use notificationManager.deleteNotificationChannel(CHANNEL_OLD_ID) to remove the obsolete channel from the system settings list.
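The _v1 → _v2 bookkeeping described above can be reduced to a pure diff over channel-ID sets. This is a hypothetical sketch — `planChannelMigration` and `ChannelOps` are names invented here; on Android the result would feed createNotificationChannel() and deleteNotificationChannel():

```kotlin
// Hypothetical sketch: given the channel IDs that already exist on the device
// and the IDs this app version wants, compute what to create and what
// obsolete versioned channels to delete from the system settings list.
data class ChannelOps(val toCreate: Set<String>, val toDelete: Set<String>)

fun planChannelMigration(existing: Set<String>, desired: Set<String>): ChannelOps =
    ChannelOps(
        toCreate = desired - existing, // e.g. a new "direct_messages_v2"
        toDelete = existing - desired  // e.g. the obsolete "direct_messages_v1"
    )
```

Because createNotificationChannel() is idempotent for unchanged channels, recomputing this diff on every app start is safe.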
Fix 2: Android 13 Notification Permission
Request the notification permission in context — specifically when the user receives their first message or enables their first chat room:
kotlin
// In the Activity or Fragment where the user first engages with messaging
private val requestPermissionLauncher = registerForActivityResult(
    ActivityResultContracts.RequestPermission()
) { isGranted ->
    if (isGranted) {
        analyticsManager.log("notif_permission_granted")
    } else {
        analyticsManager.log("notif_permission_denied")
        showInAppNotificationFallback() // Show in-app badges instead
    }
}

fun requestNotificationPermissionIfNeeded() {
    if (Build.VERSION.SDK_INT < Build.VERSION_CODES.TIRAMISU) return // Not required before API 33
    if (ContextCompat.checkSelfPermission(
            this, Manifest.permission.POST_NOTIFICATIONS
        ) == PackageManager.PERMISSION_GRANTED
    ) return // Already granted
    if (shouldShowRequestPermissionRationale(Manifest.permission.POST_NOTIFICATIONS)) {
        // User previously denied — show rationale first
        showNotificationPermissionRationale(onConfirm = {
            requestPermissionLauncher.launch(Manifest.permission.POST_NOTIFICATIONS)
        })
    } else {
        // First time — request directly
        requestPermissionLauncher.launch(Manifest.permission.POST_NOTIFICATIONS)
    }
}
Declare in AndroidManifest.xml:
xml
<uses-permission android:name="android.permission.POST_NOTIFICATIONS" />
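The branching in requestNotificationPermissionIfNeeded() can be factored into a pure decision function so each path is unit-testable without an Activity. This is a hypothetical sketch — the enum and function names are invented here; the three inputs correspond to Build.VERSION.SDK_INT, checkSelfPermission(), and shouldShowRequestPermissionRationale():

```kotlin
// Hypothetical sketch: the permission-request decision as a pure function.
enum class PermissionAction { NONE, SHOW_RATIONALE, REQUEST_DIRECTLY }

fun decideNotificationPermissionAction(
    sdkInt: Int,                 // Build.VERSION.SDK_INT
    alreadyGranted: Boolean,     // checkSelfPermission(...) == PERMISSION_GRANTED
    shouldShowRationale: Boolean // shouldShowRequestPermissionRationale(...)
): PermissionAction = when {
    sdkInt < 33 -> PermissionAction.NONE // POST_NOTIFICATIONS only exists on API 33+
    alreadyGranted -> PermissionAction.NONE
    shouldShowRationale -> PermissionAction.SHOW_RATIONALE // previously denied
    else -> PermissionAction.REQUEST_DIRECTLY              // first-time request
}
```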
Fix 3: Data-Only FCM Messages in Background — WorkManager Instead of startService()
The broken pattern: onMessageReceived() calls startService(Intent(context, NotificationService::class.java)) which throws IllegalStateException on Android 8+ when the app is in the background. The fix: do the work directly in onMessageReceived() for simple notifications (it runs in the foreground context), or enqueue a WorkRequest for complex processing:
kotlin
class AppFirebaseMessagingService : FirebaseMessagingService() {

    override fun onMessageReceived(message: RemoteMessage) {
        val data = message.data
        when (data["type"]) {
            "chat_message" -> {
                // Simple notification — do it directly in onMessageReceived (foreground context)
                showChatNotification(
                    senderId = data["senderId"]!!,
                    text = data["text"]!!,
                    messageId = message.messageId!!
                )
            }
            "media_download" -> {
                // Complex processing — enqueue WorkManager (not startService!)
                WorkManager.getInstance(applicationContext).enqueue(
                    OneTimeWorkRequestBuilder<MediaDownloadWorker>()
                        .setInputData(
                            workDataOf(
                                "mediaUrl" to data["mediaUrl"],
                                "messageId" to message.messageId
                            )
                        )
                        .build()
                )
            }
        }
    }

    private fun showChatNotification(senderId: String, text: String, messageId: String) {
        // Build and show the notification directly — no Service needed
        val notification = NotificationCompat.Builder(this, AppApplication.CHANNEL_DIRECT_MESSAGES)
            .setContentTitle("New message from $senderId")
            .setContentText(text)
            .setSmallIcon(R.drawable.ic_message)
            .setAutoCancel(true)
            .build()
        NotificationManagerCompat.from(this).notify(messageId.hashCode(), notification)
    }
}
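The routing inside onMessageReceived() can be extracted into a pure function, which also removes the `!!` crash risk on malformed payloads. This is a hypothetical sketch — `FcmAction` and `routeDataMessage` are names invented here, not part of the Firebase API:

```kotlin
// Hypothetical sketch: a data message maps either to an immediate notification
// or to background work; malformed payloads are ignored rather than crashing
// the FCM service with a NullPointerException (as data["senderId"]!! would).
sealed interface FcmAction {
    data class ShowNotification(val senderId: String, val text: String) : FcmAction
    data class EnqueueWork(val mediaUrl: String) : FcmAction
    object Ignore : FcmAction
}

fun routeDataMessage(data: Map<String, String>): FcmAction = when (data["type"]) {
    "chat_message" -> {
        val sender = data["senderId"]
        val text = data["text"]
        if (sender != null && text != null) FcmAction.ShowNotification(sender, text)
        else FcmAction.Ignore // malformed payload — drop it, never crash
    }
    "media_download" -> data["mediaUrl"]?.let { FcmAction.EnqueueWork(it) } ?: FcmAction.Ignore
    else -> FcmAction.Ignore
}
```

onMessageReceived() then becomes a thin dispatcher over the returned action, and the routing logic gets plain JVM unit tests.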
Fix 4: FCM Duplicate Notification Deduplication
The FirebaseMessagingService.onMessageReceived() receives all FCM messages; deduplication is checked before any processing:
kotlin
class AppFirebaseMessagingService : FirebaseMessagingService() {

    private val processedMessageIds by lazy {
        getSharedPreferences("fcm_dedup", Context.MODE_PRIVATE)
    }

    override fun onMessageReceived(message: RemoteMessage) {
        val messageId = message.messageId ?: return // Messages without ID cannot be deduplicated
        if (isAlreadyProcessed(messageId)) {
            Timber.d("Duplicate FCM message $messageId — discarding")
            analyticsManager.log("notif_duplicate_discarded")
            return
        }
        markAsProcessed(messageId)
        // ... process the message
    }

    private fun isAlreadyProcessed(messageId: String): Boolean =
        processedMessageIds.contains(messageId)

    private fun markAsProcessed(messageId: String) {
        processedMessageIds.edit {
            // Store the message ID with the current timestamp
            putLong(messageId, System.currentTimeMillis())
            // Prune entries older than 10 minutes (the duplicate delivery window)
            val cutoff = System.currentTimeMillis() - TEN_MINUTES_MS
            processedMessageIds.all.entries
                .filter { it.value as Long < cutoff }
                .forEach { remove(it.key) }
        }
    }

    companion object {
        private const val TEN_MINUTES_MS = 10 * 60 * 1000L
    }
}
The sliding window (prune entries older than 10 minutes) bounds the SharedPreferences size to at most the number of messages received in a 10-minute window — for a busy messaging app, this is a reasonable bound (a user receiving 100 messages in 10 minutes stores 100 entries × approximately 50 bytes each = approximately 5KB; acceptable for SharedPreferences).
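The framework section noted that topic + direct sends can deliver the same conceptual notification under different message IDs, so a second, content-level check is needed. This is a hypothetical in-memory sketch — `ContentDeduplicator` and its 5-minute window are assumptions, not part of the fix above; a real version would persist alongside the message-ID store:

```kotlin
// Hypothetical sketch: content-level deduplication. Two messages with
// different FCM message IDs but identical (title, body) within a short
// window are treated as the same conceptual notification.
class ContentDeduplicator(
    private val windowMs: Long = 5 * 60 * 1000L,           // assumed window
    private val now: () -> Long = System::currentTimeMillis // injectable clock
) {
    private val seen = HashMap<String, Long>() // contentKey -> firstSeenAt

    /** Returns true if this (title, body) was already shown inside the window. */
    fun isDuplicate(title: String, body: String): Boolean {
        val ts = now()
        seen.entries.removeAll { ts - it.value > windowMs } // prune expired keys
        val key = "$title|$body"
        val duplicate = seen.containsKey(key)
        if (!duplicate) seen[key] = ts
        return duplicate
    }
}
```

In onMessageReceived(), this check would run after the message-ID check: the ID layer catches FCM's at-least-once redelivery, the content layer catches server-side double sends.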
Early Warning Metrics:
- notif_permission_granted / notif_permission_denied ratio by API level and user cohort — segmented by whether the user was shown the rationale screen first; the cohort shown the rationale should have a 15–25% higher grant rate; if not, the rationale message is not compelling
- Notification display rate (sent FCM → notification shown) — Firebase Analytics log notif_shown in showChatNotification() and compare against the FCM delivery report in the Firebase Console; a delivery-to-display gap above 5% indicates a delivery failure (permission denied, notification channel misconfiguration, or the IllegalStateException bug recurring)
- Duplicate message discard rate daily — the notif_duplicate_discarded event rate as a percentage of total FCM messages received; above 1% indicates a server-side sending bug that must be fixed at the FCM sending layer; the client-side deduplication is a safety net, not a substitute for correct server behaviour
4. Interview Score: 9.5 / 10
Why this demonstrates senior-level maturity: The notification channel _v1 suffix convention — with the explicit explanation that channels cannot have their default settings reset once a user has customised them, and that _v2 allows new users to get new defaults while preserving old users' settings — is the production notification engineering detail that app teams discover the hard way after their first channel misconfiguration reaches production. The 10-minute sliding window for the FCM deduplication store — with the specific size calculation (100 messages × 50 bytes = 5KB, well within SharedPreferences limits) — is the bounding analysis that prevents the deduplication store from becoming a memory problem on high-volume accounts. The onMessageReceived() showing the notification directly (not via startService()) with the explicit explanation that FirebaseMessagingService itself runs in the foreground context (immune to the background service restriction) is the precise API understanding.
What differentiates it from mid-level thinking: A mid-level Android developer would know about notification channels and FCM but would not know about the _v1 channel naming convention for future-proofing, would fix the background service crash by switching to startForegroundService() (which still requires manual foreground service management) rather than processing directly in onMessageReceived(), would not know about the POST_NOTIFICATIONS permission's shouldShowRequestPermissionRationale() pre-dialog pattern for API 33+, and would not design the sliding window for the deduplication store.
What would make it a 10/10: A 10/10 response would include a NotificationChannelGroup configuration grouping the 4 channels under a single "Notifications" group in system settings, a Conversation notification style implementation for the direct messages channel (using Person API and MessagingStyle for Android 10+ rich messaging notifications), and a FirebaseMessagingService integration test using FirebaseTestLab or Robolectric's FCM simulation to verify all 4 fixes in CI.