Microservice Architecture and CI/CD in a monolithic Git repository

When talking about Git repositories there are two main approaches:

1. Monolithic Git repository

2. Multi-repository

What is a monolithic Git repository?

A monolithic Git repository (monorepo) refers to a version control repository that contains the entire source code and assets of a software project within a single repository. In other words, all components, modules, libraries, and other resources of the project are stored together in a unified structure.

In contrast, a distributed or modular approach splits a project into multiple separate repositories, each containing a specific component or module. Monolithic repositories are commonly used in software development, especially in the early stages of a project or in smaller organizations, as they offer several benefits:

1.Simplicity: With all code in one repository, it's often easier to manage and coordinate changes across different parts of the project.

2.Visibility: Developers have a broader view of the entire project, making it easier to understand how different components interact.

3.Simplified Dependency Management: Since all components are together, there's no need to manage complex dependency relationships between different repositories.

4.Consistent Builds: Building the entire project becomes straightforward, as all required components are available in one place.

5.Easier Refactoring: Refactoring code that spans multiple components can be simpler when everything is in one repository.

However, as a project grows larger and more complex, there are also downsides to using a monolithic repository:

1.Scalability: As the project scales, the repository can become quite large, making cloning, branching, and pulling updates more time-consuming.

2.Isolation and Autonomy: It might become harder for different teams to work independently on specific components, as they all share the same repository.

3.Risk of Conflicts: With many developers contributing to a large repository, the potential for merge conflicts increases.

4.Build Times: Building the entire project can become slower, especially if only a small part of the code has changed.

5.Flexibility: It might be challenging to adopt different development and deployment workflows for different components.

What is a multi-repository approach?

A multi-repository (multi-repo) approach involves managing different components, modules, or services of a software project in separate Git repositories rather than keeping them all together in a single monolithic repository. Each repository focuses on a specific part of the project, allowing for more granular control, independent development, and easier maintenance.

Here are some key aspects and benefits of a multi-repository approach:

1.Isolation and Independence: Each repository is dedicated to a particular component or service. This isolation enables different teams or developers to work on their respective parts without interfering with others. Changes made in one repository won't directly impact the others.

2.Independent Versioning: Each repository can have its own versioning and release cycle. This is particularly useful when different components evolve at different rates or are used across multiple projects.

3.Focused Codebase: Developers working in a specific repository can focus on the codebase relevant to their area of expertise. This can lead to improved code quality and better maintainability.

4.Simplified Dependency Management: Instead of dealing with all project dependencies in a monolithic repository, each repository can manage its dependencies separately. This can lead to more consistent and reliable dependency management.

5.Reduced Clutter: A multi-repository approach can prevent a single repository from becoming too large and unwieldy, which can slow down operations like cloning, branching, and merging.

6.Parallel Development: Teams can work on different repositories concurrently, promoting parallel development of various parts of the project.

7.Flexibility in Technology Stacks: Different components of a project might require different programming languages, frameworks, or tools. In a multi-repository setup, you have the flexibility to choose the best technology for each component.

8.Decoupled Deployment: If components of your project are deployed independently, having separate repositories can streamline the deployment process and reduce the risk of impacting other components.

However, a multi-repository approach also presents challenges and considerations:

1.Dependency Management: While independent repositories simplify dependency management within each repository, you need a strategy to manage dependencies between repositories. This might involve using package managers, Git submodules, or other tools.

2.Communication and Collaboration: Efficient communication and collaboration become crucial, especially when changes in one repository might affect others. Clear communication channels and shared guidelines are essential.

3.Integration Testing: Integration testing becomes important to ensure that changes in one repository don't break functionality in others. Automated testing and continuous integration practices help address this challenge.

4.Project-Wide Changes: Implementing changes that span multiple repositories, such as refactoring or introducing common patterns, can be more complex.

5.Learning Curve: Teams might need to adapt to new tools and practices to effectively manage multiple repositories.

Microservice Architecture and code repository approach

Microservices architecture is an approach to designing and building software applications as a collection of small, independent, and loosely coupled services. Each service in a microservices architecture represents a distinct functional unit that operates as a separate process or container and communicates with other services over a network, often using APIs (Application Programming Interfaces).

Key characteristics of microservice architecture with respect to code repository and CI/CD:

1.Each service has its own repository, allowing different teams to work on different services independently.

2.Each service operates independently and can be developed, deployed, and scaled individually. Changes to one service do not necessarily impact others.

3.Development teams can work on different services concurrently, using different technologies and programming languages if needed.

4.Services can be deployed and updated independently, allowing for faster and more frequent releases.

5.Each service team is responsible for the entire lifecycle of their service, including development, testing, deployment, and monitoring.

From the above, it seems that the better repository approach for a microservice architecture is the multi-repository approach.

Why did we choose a monolithic Git repository for a microservice architecture?

The decision between a monolithic and a multi-repository structure depends on various factors, including the project size, the number of developers working on it, the complexity of the components, and the organization's development workflow. Both approaches have their pros and cons.

The main reasons for choosing monolithic over multi-repository are:

1.Team size: Because we work in small teams and do not dedicate a team to each service, a monolithic repository is the better choice: there is no need to switch between repositories to complete regular day-to-day work.

2.Same technology stack for all services: Although this runs counter to the technology freedom that a microservice architecture allows, our team structure and size mean that all our services are written in the same language and technology stack. A monolithic repository also simplifies enforcing the same code policy across all services and libraries.

3.Easier Integration Testing: Within a single repository it is more difficult to break adjacent functionality. We run the tests for all components whenever any single component is modified. As a result, a change in one component is much less likely to have adverse effects on other components and libraries.

4.Centrally located code management: A single repository gives all developers visibility of all the code. It also simplifies code management, since we can use a single issue tracker to follow all issues throughout the application's life cycle.

5.Painless Application-Wide Refactoring: An application-wide refactoring affects multiple libraries. If they are hosted in multiple repositories, keeping all the different pull requests synchronized with each other can be a challenge. A monolithic repository makes it easy to modify the code of all libraries and submit the changes in a single pull request.

Monolithic repository with Code Organization

Even in a single repository, we still need to manage complexity. We organized the codebase in a modular manner, using folders and namespaces to separate the different components, and we enforce clear guidelines on how the codebase is structured.
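As a minimal sketch of what such a folder convention could look like in practice, the snippet below maps a repository-relative path to the component that owns it and lists files that fall outside the convention. The top-level folder names `services` and `libs`, and the function names, are assumptions made for illustration; the article does not name the real layout.

```python
from pathlib import PurePosixPath
from typing import List, Optional

# Hypothetical top-level folders; the real repository layout is not given.
COMPONENT_ROOTS = ("services", "libs")


def component_of(path: str) -> Optional[str]:
    """Return '<root>/<name>' for a file inside a component folder, else None."""
    parts = PurePosixPath(path).parts
    # A component file needs at least <root>/<name>/<file>.
    if len(parts) >= 3 and parts[0] in COMPONENT_ROOTS:
        return f"{parts[0]}/{parts[1]}"
    return None


def files_outside_components(paths: List[str]) -> List[str]:
    """List the paths that do not belong to any component folder."""
    return [p for p in paths if component_of(p) is None]
```

A check like `files_outside_components` could run in the PR pipeline to flag code that was added outside the agreed structure.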

Continuous Integration Pipeline

Because we follow trunk-based development, we rely on short-lived branches and a high level of automation. Ideally, each developer opens one pull request per day. This minimizes merge conflicts, because no branch stays apart from the main branch for long. As mentioned above, building the entire project becomes slower as the codebase grows. Our PR pipeline runs all checks even when only a small part of one component changes, so it can become a bottleneck. To overcome this, we split the PR pipeline into multiple stages that run in parallel, which reduced build time considerably.
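The effect of running independent stages in parallel can be illustrated with a small sketch. In a real setup the CI system's pipeline configuration does this job, so the Python below is only an illustration; the stage names in the usage note and both function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_stages_in_parallel(stages: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run every stage concurrently and collect each stage's pass/fail result."""
    with ThreadPoolExecutor(max_workers=max(1, len(stages))) as pool:
        futures = {name: pool.submit(check) for name, check in stages.items()}
        # result() blocks until that stage has finished.
        return {name: future.result() for name, future in futures.items()}


def pipeline_passed(results: Dict[str, bool]) -> bool:
    """The PR check passes only if every stage passed."""
    return all(results.values())
```

For example, `run_stages_in_parallel({"lint": run_lint, "unit_tests": run_unit_tests, "build": run_build})` would start all three hypothetical checks at once. With enough workers, the wall-clock time approaches the duration of the slowest stage rather than the sum of all stages.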

Continuous Deployment Pipelines

As mentioned above, a microservice architecture should enable independent service deployments. In a single repository, this is harder to achieve. Because we deploy our services to Kubernetes, we store the Kubernetes configuration files for each service in its own Git repository. For deployment we use Argo CD, an open-source declarative continuous delivery tool designed specifically for Kubernetes applications. Argo CD continuously monitors the Git repositories where our configuration files are stored. When it detects changes, it automatically applies them to the Kubernetes clusters, so that the actual cluster state matches the desired state in the Git repository.

Thanks to the code organization described above, we can clearly tell which services are affected by code merged into the main branch. When an affected service is detected (or several, since one change can affect more than one service), the release pipeline responsible for it is triggered. That pipeline does not deploy the service itself. Instead, it updates the Kubernetes configuration files used to deploy the service (at a minimum, it changes the image tag). From there Argo CD takes over and applies the changes to the Kubernetes cluster.
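The two steps a release pipeline performs here, deciding which services a merge affects and bumping the image tag in their configuration, might be sketched as follows. This is not our actual pipeline code: the `services/` and `libs/` folders, the `SERVICE_LIBS` dependency map, the service names, and both function names are assumptions made for illustration.

```python
import re
from typing import Dict, Iterable, Set

# Hypothetical map of the shared libraries each service depends on.
SERVICE_LIBS: Dict[str, Set[str]] = {
    "payments": {"common"},
    "accounts": {"common", "ledger"},
}


def affected_services(changed_files: Iterable[str]) -> Set[str]:
    """Services changed directly, or indirectly through a changed library."""
    affected: Set[str] = set()
    for path in changed_files:
        parts = path.split("/")
        if len(parts) < 3:
            continue  # file outside any component folder
        root, name = parts[0], parts[1]
        if root == "services":
            affected.add(name)
        elif root == "libs":
            affected.update(s for s, libs in SERVICE_LIBS.items() if name in libs)
    return affected


# Matches an 'image: <repository>:<tag>' line and captures everything before the tag.
IMAGE_LINE = re.compile(
    r"^([ \t]*(?:-[ \t]*)?image:[ \t]*\S+?):[^\s:/]+[ \t]*$", re.MULTILINE
)


def set_image_tag(manifest: str, new_tag: str) -> str:
    """Replace the tag on every tagged 'image:' line of a manifest text."""
    return IMAGE_LINE.sub(lambda m: f"{m.group(1)}:{new_tag}", manifest)
```

The pipeline would then commit the rewritten manifest to the service's configuration repository, and Argo CD would pick up the change from there. A plain-text substitution is used only to keep the sketch dependency-free; a real pipeline might prefer a YAML-aware tool.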

Tietoevry FinTech
