Clean Architecture Guide (with tested examples): Data Flow != Dependency Rule

Mario Sanoguera de Lorenzo
Published in ProAndroidDev
5 min read · Jul 26, 2018


Difference between Data Flow & Dependency Rule

Hello everyone! 👋 In this story I want to explain Clean Architecture (with tested examples) & talk about the most common mistake when adopting it: confusing Data Flow with the Dependency Rule.

This story is the follow-up to my latest Medium story, Intro to App Architecture. Make sure to check it out if you are interested in what makes a good architecture and why, plus how to achieve a more maintainable codebase.

Data Flow

Let’s start explaining Data Flow in Clean Architecture with an example.

Imagine opening an app that loads a list of posts, each of which contains additional user information. The Data Flow would be:

1. UI calls method from Presenter/ViewModel.

2. Presenter/ViewModel executes Use case.

3. Use case combines data from User and Post Repositories.

4. Each Repository returns data from a Data Source (Cached or Remote).

5. Information flows back to the UI where we display the list of posts.

From the example above we can see how the user action flows from the UI all the way up to the Data Source and then flows back down. This Data Flow is not the same flow as the Dependency Rule.
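The five steps above can be traced with a small, framework-free Kotlin sketch. Every name in it (`PostsViewModel`, `GetPostsUseCase`, the `trace` list and so on) is hypothetical and only there to record the order of calls; it is not the sample app’s code, and it deliberately ignores layering, which is the topic of the next section.

```kotlin
// Records each step so the order of the flow can be inspected afterwards.
val trace = mutableListOf<String>()

class DataSource {
    fun load(what: String): List<String> {
        trace += "4. DataSource returns $what"
        return listOf("$what-1", "$what-2")
    }
}

class Repository(private val name: String, private val source: DataSource) {
    fun get(): List<String> = source.load(name)
}

class GetPostsUseCase(private val users: Repository, private val posts: Repository) {
    fun execute(): List<String> {
        trace += "3. UseCase combines users and posts"
        return posts.get().zip(users.get()) { post, user -> "$post by $user" }
    }
}

class PostsViewModel(private val useCase: GetPostsUseCase) {
    fun loadPosts(): List<String> {
        trace += "2. ViewModel executes UseCase"
        return useCase.execute()
    }
}

// Stands in for the UI: triggers the load and "displays" the result.
fun openScreen(): List<String> {
    trace += "1. UI calls ViewModel"
    val source = DataSource()
    val viewModel = PostsViewModel(
        GetPostsUseCase(Repository("user", source), Repository("post", source))
    )
    val items = viewModel.loadPosts()
    trace += "5. UI displays ${items.size} posts"
    return items
}
```

Calling `openScreen()` and then reading `trace` shows the request travelling from the UI to the Data Source and the data travelling back.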

Dependency Rule

The Dependency Rule is the relationship that exists between the different layers. Before explaining the Dependency Rule in Clean Architecture, let’s rotate the onion 90 degrees. This helps to point out layers & boundaries. 🆒

Clean Architecture Layers

Let’s identify the different layers & boundaries.

Presentation Layer contains UI (Activities & Fragments) coordinated by Presenters/ViewModels, which execute one or more Use cases. Presentation Layer depends on Domain Layer.

Domain Layer is the innermost part of the onion (no dependencies on other layers) and it contains Entities, Use cases & Repository Interfaces. Use cases combine data from one or more Repository Interfaces.

Data Layer contains Repository Implementations and one or more Data Sources. Repositories are responsible for coordinating data from the different Data Sources. Data Layer depends on Domain Layer.

Explaining the Domain Layer

Domain (with business rules) is the most important Layer.

Domain is at the center of the onion, which means it is the core of our program. This is one of the main reasons why it shouldn’t have any dependencies on other layers.

Presentation and Data Layers are less important since they are only implementations that can be easily replaced. The list of posts could be displayed in Android, iOS, Web or even Terminal if your code is properly decoupled. The same goes for a Database or any other kind of Data Source: it can be easily switched.

The further out you go on the onion, the more likely things are to change. One of the most common mistakes is to let your data layer or a specific data system drive your app, which makes it hard to replace or bridge with different data sources down the line.

Domain Layer does NOT depend on Data Layer.

Having modules with the correct dependency rules means that our Domain doesn’t have any dependency on any other layer. Because it has no dependencies on any Android library, the Domain Layer should be a plain Kotlin module. This is an extra boundary that prevents polluting our most valuable layer with framework-related classes. It also promotes reusability across platforms in case we switch frameworks, as our Domain Layer is completely agnostic.
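As a rough idea of what that boundary looks like in a build, here is a minimal sketch using the Gradle Kotlin DSL. The module names (`:domain`, `data`, `presentation`) and the choice of Kotlin DSL over Groovy are assumptions for illustration, not taken from the sample app.

```kotlin
// domain/build.gradle.kts (hypothetical module)
plugins {
    kotlin("jvm") // plain Kotlin/JVM module: no Android plugin, no Android libraries
}

// data/build.gradle.kts and presentation/build.gradle.kts (hypothetical Android modules)
dependencies {
    implementation(project(":domain")) // outer layers point inward, never the reverse
}
```

With this setup, importing an Android class inside the domain module fails at compile time, so the boundary is enforced by the build rather than by convention.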

Dependencies between Layers

Explaining the Domain Layer with an example!

What is the real problem that we need to solve?

“Load a list of posts with some user information for each post”

This is the core of our solution no matter where the data comes from or how we present it. It belongs to a Use case inside our Domain Layer, which is the innermost layer of the architecture (business logic).

For those who haven’t tried Clean Architecture yet: Use cases help avoid God Presenters/ViewModels, since the Presentation Layer only executes Use cases and notifies the view (Separation of concerns + Single Responsibility Principle). This also improves the RUDT points (Read, Update, Debug & Test) of your project.

Tests here (omitted to make the story shorter).

This Use case is combining data from 2 repositories (UserRepository & PostRepository).
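A use case like this could be sketched as follows. The entity fields, the synchronous `List` return types and the name `GetPostsWithUsersUseCase` are assumptions made to keep the example free of libraries; a real app would likely return a reactive or suspending type instead.

```kotlin
// Domain entities (hypothetical fields).
data class User(val id: String, val name: String)
data class Post(val id: String, val userId: String, val title: String)
data class PostWithUser(val post: Post, val user: User)

// Contracts owned by the Domain Layer.
interface UserRepository { fun getUsers(): List<User> }
interface PostRepository { fun getPosts(): List<Post> }

class GetPostsWithUsersUseCase(
    private val userRepository: UserRepository,
    private val postRepository: PostRepository
) {
    fun execute(): List<PostWithUser> {
        // Index users once so each post can find its author by id.
        val usersById = userRepository.getUsers().associateBy { it.id }
        // Posts whose author is unknown are skipped in this sketch.
        return postRepository.getPosts().mapNotNull { post ->
            usersById[post.userId]?.let { user -> PostWithUser(post, user) }
        }
    }
}
```

Because the use case only sees the two interfaces, it can be unit tested with simple fakes and no Android dependencies.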

How does Domain NOT depend on Data?

This is because Use cases in Domain do not use the actual implementation of the Repository that sits in the Data Layer. Instead, they use an abstraction/interface in the Domain Layer that acts as a contract for any repository that wants to provide the data.

In Clean Architecture it is the responsibility of the Data Layer to have one or more implementations of the Domain’s interfaces and to bind the interface with the actual implementation.
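Before reaching for a DI library, the same idea can be sketched by hand. Everything here is hypothetical (`CountPostsUseCase`, `InMemoryPostRepository`, `provideCountPostsUseCase`); the comments mark which layer each piece would live in.

```kotlin
// --- Domain Layer: owns the contract and knows nothing about Data ---
interface PostRepository { fun getPostTitles(): List<String> }

class CountPostsUseCase(private val repository: PostRepository) {
    fun execute(): Int = repository.getPostTitles().size
}

// --- Data Layer: depends on Domain by implementing its interface ---
class InMemoryPostRepository : PostRepository {
    override fun getPostTitles() = listOf("Hello", "Clean Architecture")
}

class EmptyPostRepository : PostRepository {
    override fun getPostTitles() = emptyList<String>()
}

// --- Binding: the only place that names a concrete implementation ---
fun provideCountPostsUseCase(
    repository: PostRepository = InMemoryPostRepository()
) = CountPostsUseCase(repository)
```

Swapping `InMemoryPostRepository` for `EmptyPostRepository` changes the result without touching a single line of the use case, which is the point of the contract.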

Dagger2 binding example:
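The original snippet is not reproduced here, so the following is only a minimal sketch of how such a binding is commonly written with Dagger 2’s `@Binds`. `PostRepositoryImpl` and `RepositoryModule` are hypothetical names, and the module still has to be installed in a component for it to take effect.

```kotlin
import dagger.Binds
import dagger.Module
import javax.inject.Inject

// Domain Layer contract (hypothetical, mirrors the repositories named above).
interface PostRepository

// Data Layer implementation; @Inject lets Dagger construct it.
class PostRepositoryImpl @Inject constructor() : PostRepository

@Module
abstract class RepositoryModule {
    // Whenever a PostRepository is requested, provide PostRepositoryImpl.
    @Binds
    abstract fun bindPostRepository(impl: PostRepositoryImpl): PostRepository
}
```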

Koin binding example:
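Again as a sketch rather than the original snippet: with Koin the binding is a line in a module definition. This assumes the `org.koin.dsl.module` import from Koin 2.x or later (older releases may differ), and `PostRepositoryImpl` and `repositoryModule` are hypothetical names.

```kotlin
import org.koin.dsl.module

// Domain Layer contract and Data Layer implementation (both hypothetical).
interface PostRepository
class PostRepositoryImpl : PostRepository

val repositoryModule = module {
    // Resolve PostRepository requests with a new PostRepositoryImpl each time.
    factory<PostRepository> { PostRepositoryImpl() }
}
```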

This abstraction with the interface and its binding is the Dependency Inversion principle (D from SOLID) which is a way of decoupling modules.

High-level modules should not depend on low-level modules, both should depend on abstractions.

In simple terms, this means adding/depending on interfaces so we can easily switch the implementation and decouple our software modules.

Explaining the Data Layer: Repositories Implementation & Data Sources

The Repository Implementation implements the Domain Repository Interface and is in charge of combining 1 or multiple Data Sources. The most common ones are Memory, Cache & Remote.

Tests here (omitted to make the story shorter).

This Repository pulls from the cache and remote interfaces (each Data Source then binds to an implementation). This decoupling makes the Repository agnostic of its Data Sources, so switching a Data Source implementation requires no changes to the Repository.
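Here is a minimal sketch of a repository written against data source interfaces. The interface names, the synchronous API and the in-memory cache are assumptions for illustration only.

```kotlin
data class Post(val id: String, val title: String)

// Domain contract.
interface PostRepository { fun getPosts(): List<Post> }

// Data source contracts; the repository only ever sees these.
interface PostCacheDataSource {
    fun get(): List<Post>
    fun set(posts: List<Post>)
}
interface PostRemoteDataSource { fun fetch(): List<Post> }

class PostRepositoryImpl(
    private val cache: PostCacheDataSource,
    private val remote: PostRemoteDataSource
) : PostRepository {
    // Serve cached posts when present; otherwise fetch them and fill the cache.
    override fun getPosts(): List<Post> =
        cache.get().ifEmpty { remote.fetch().also(cache::set) }
}

// One possible cache implementation; a database-backed one could replace it
// without touching PostRepositoryImpl.
class InMemoryPostCache : PostCacheDataSource {
    private var posts: List<Post> = emptyList()
    override fun get() = posts
    override fun set(posts: List<Post>) { this.posts = posts }
}
```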

Repositories expect Data Sources to return Domain Models already, which pushes the responsibility of mapping from Data to Domain onto each individual Data Source implementation (or from Domain to Data when you are pushing something to the Data Source).
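That mapping responsibility could look roughly like this. `PostEntity`, its fields and the `api` lambda standing in for a network client are all hypothetical.

```kotlin
// Domain model (hypothetical fields).
data class Post(val id: String, val title: String)

// Data Layer model, shaped like a hypothetical remote response.
data class PostEntity(val id: Int, val title: String?)

// Data -> Domain, used when pulling.
fun PostEntity.toDomain() = Post(id = id.toString(), title = title.orEmpty())

// Domain -> Data, used when pushing. Assumes the id is numeric; toInt() throws otherwise.
fun Post.toEntity() = PostEntity(id = id.toInt(), title = title)

interface PostRemoteDataSource { fun fetch(): List<Post> }

// The implementation maps before returning, so the repository never sees PostEntity.
class PostRemoteDataSourceImpl(
    private val api: () -> List<PostEntity> // stands in for a real network client
) : PostRemoteDataSource {
    override fun fetch(): List<Post> = api().map { it.toDomain() }
}
```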

Data Strategies.

Multiple Data Sources lead to different Data Strategies. My favorite is to only return Cache (single source of truth) and to refresh Cache from Remote only when it is empty or there is a user action (swipe to refresh, for example). This saves lots of data and was inspired by Build for the next billion users from Android Developers.
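The strategy itself fits in a few lines. This is a sketch under simplifying assumptions: a hypothetical `CacheFirstStore` class, an in-memory cache, a synchronous remote call, and a `remoteCalls` counter that exists only to make the behaviour observable.

```kotlin
class CacheFirstStore<T>(private val fetchRemote: () -> List<T>) {
    private var cache: List<T> = emptyList()

    // Counts remote hits so the data savings can be checked.
    var remoteCalls = 0
        private set

    // refresh = true models a user action such as swipe to refresh.
    fun get(refresh: Boolean = false): List<T> {
        if (refresh || cache.isEmpty()) {
            cache = fetchRemote()
            remoteCalls++
        }
        return cache // the cache is always what gets returned
    }
}
```

Calling `get()` twice in a row hits the remote once; only `get(refresh = true)` or an empty cache triggers another remote call.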

Final recap:

I’ve omitted the Presentation Layer as it only needs to execute use cases & display data. Remember that each layer has its own entities & mappers and that, in order to keep our Domain Layer free of dependencies, Data and Presentation are responsible for mapping to/from Domain Entities depending on whether you are pushing or pulling data.
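For completeness, here is what that thin Presentation Layer could look like. `PostItem`, `PostListPresenter` and the two lambdas standing in for a use case and a view are hypothetical, chosen to keep the sketch independent of Android.

```kotlin
// Domain entity (hypothetical fields).
data class PostWithUser(val title: String, val userName: String)

// Presentation model: only what the screen needs to draw.
data class PostItem(val headline: String)

// Domain -> Presentation mapper.
fun PostWithUser.toItem() = PostItem("$title (by $userName)")

class PostListPresenter(
    private val getPosts: () -> List<PostWithUser>, // stands in for a use case
    private val render: (List<PostItem>) -> Unit    // stands in for the view
) {
    // Execute the use case, map to presentation models, notify the view.
    fun onScreenOpened() = render(getPosts().map { it.toItem() })
}
```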

Done! 👏 👏 👏

You can check out the sample app that was used as the example by clicking up here ☝️

Credits:

→ My friend Igor Wojda 👋🤖’s Clean Architecture slides and talk.

→ Issue asking why Domain isn’t depending on the Data layer.

Remember to follow, share & hit the 👏 button if you’ve liked it! (:

GitHub | LinkedIn | Twitter
