Skip to main content

66 posts tagged with "architecture"

View all tags

Introduction to Architecture

· 3 min read

What is Architecture?

Architecture is the shape of a software system. To illustrate with a building:

  • Paradigm is the bricks.
  • Design principles are the rooms.
  • Components are the structure.

They all serve a specific purpose, just like hospitals treat patients and schools educate students.

Why Do We Need Architecture?

Behavior vs. Structure

Every software system provides two distinct values to stakeholders: behavior and structure. Software developers must ensure that both values are high.

==Due to the nature of their work, software architects focus more on the structure of the system rather than its features and functions.==

Ultimate Goal — ==Reduce the human resource costs required for adding new features==

Architecture serves the entire lifecycle of software systems, making them easy to understand, develop, test, deploy, and operate. Its goal is to minimize the human resource costs for each business use case.

O'Reilly's "Software Architecture" provides a great introduction to these five fundamental architectures.

1. Layered Architecture

Layered architecture is widely adopted and well-known among developers. Therefore, it is the de facto standard at the application level. If you are unsure which architecture to use, layered architecture is a good choice.

Examples:

  • TCP/IP model: Application Layer > Transport Layer > Internet Layer > Network Interface Layer
  • Facebook TAO: Network Layer > Cache Layer (follower + leader) > Database Layer

Pros and Cons:

  • Pros
    • Easy to use
    • Clear responsibilities
    • Testability
  • Cons
    • Large and rigid
      • Adjusting, extending, or updating the architecture requires changes across all layers, which can be quite tricky.

2. Event-Driven Architecture

Any change in state triggers an event in the system. Communication between system components is accomplished through events.

A simplified architecture includes a mediator, event queue, and channels. The diagram below illustrates a simplified event-driven architecture:

Examples:

  • QT: Signals and Slots
  • Payment infrastructure: As bank gateways often have high latency, asynchronous techniques are used in banking architecture.

3. Microkernel Architecture (aka Plug-in Architecture)

The functionality of the software is distributed between a core and multiple plugins. The core contains only the most basic functionalities. Each plugin operates independently and implements shared interfaces to achieve different goals.

Examples:

  • Visual Studio Code and Eclipse
  • MINIX operating system

4. Microservices Architecture

Large systems are decomposed into numerous microservices, each a separately deployable unit that communicates via RPCs.

uber architecture

Examples:

5. Space-Based Architecture

The name "Space-Based Architecture" comes from "tuple space," which implies a "distributed shared space." In space-based architecture, there are no databases or synchronized database access, thus avoiding database bottleneck issues. All processing units share copies of application data in memory. These processing units can be flexibly started and stopped.

Example: See Wikipedia

  • Primarily adopted by Java-based architectures: for example, JavaSpaces.

Stream and Batch Processing Frameworks

· 2 min read

Why Do We Need Such Frameworks?

  • To process more data in a shorter amount of time.
  • To unify fault tolerance in distributed systems.
  • To simplify task abstractions to meet changing business requirements.
  • Suitable for bounded datasets (batch processing) and unbounded datasets (stream processing).

Brief History of Batch and Stream Processing Development

  1. Hadoop and MapReduce. Google made batch processing as simple as MapReduce result = pairs.map((pair) => (morePairs)).reduce(somePairs => lessPairs) in a distributed system.
  2. Apache Storm and directed graph topologies. MapReduce does not represent iterative algorithms well. Therefore, Nathan Marz abstracted stream processing into a graph structure composed of spouts and bolts.
  3. Spark in-memory computation. Reynold Xin pointed out that Spark uses ten times fewer machines than Hadoop while being three times faster when processing the same data.
  4. Google Dataflow based on Millwheel and FlumeJava. Google uses a windowed API to support both batch and stream processing simultaneously.
  1. Flink quickly adopted the programming model of ==Google Dataflow== and Apache Beam.
  2. Flink's efficient implementation of the Chandy-Lamport checkpointing algorithm.

These Frameworks

Architecture Choices

To meet the above demands with commercial machines, there are several popular distributed system architectures...

  • Master-slave (centralized): Apache Storm + Zookeeper, Apache Samza + YARN
  • P2P (decentralized): Apache S4

Features

  1. DAG Topology for iterative processing - for example, GraphX in Spark, topologies in Apache Storm, DataStream API in Flink.
  2. Delivery Guarantees. How to ensure the reliability of data delivery between nodes? At least once / at most once / exactly once.
  3. Fault Tolerance. Implement fault tolerance using cold/warm/hot standby, checkpointing, or active-active.
  4. Windowed API for unbounded datasets. For example, streaming windows in Apache. Window functions in Spark. Windowing in Apache Beam.

Comparison Table of Different Architectures

ArchitectureStormStorm-tridentSparkFlink
ModelNativeMicro-batchMicro-batchNative
GuaranteesAt least onceExactly onceExactly onceExactly once
Fault ToleranceRecord AckRecord AckCheckpointCheckpoint
Maximum Fault ToleranceHighMediumMediumLow
LatencyVery lowHighHighLow
ThroughputLowMediumHighHigh

A Closer Look at iOS Architecture Patterns

· 3 min read

Why Should We Care About Architecture?

The answer is: to reduce the human resources spent on each feature.

Mobile developers evaluate the quality of an architecture on three levels:

  1. Whether the responsibilities of different features are evenly distributed
  2. Whether it is easy to test
  3. Whether it is easy to use and maintain
Responsibility DistributionTestabilityUsability
Tight Coupling MVC
Cocoa MVC❌ V and C are coupled✅⭐
MVP✅ Independent view lifecycleAverage: more code
MVVMAverage: View has a dependency on UIKitAverage
VIPER✅⭐️✅⭐️

Tight Coupling MVC

Traditional MVC

For example, in a multi-page web application, when you click a link to navigate to another page, the entire page reloads. The problem with this architecture is that the View is tightly coupled with the Controller and Model.

Cocoa MVC

Cocoa MVC is the architecture recommended by Apple for iOS developers. Theoretically, this architecture allows the Controller to decouple the Model from the View.

Cocoa MVC

However, in practice, Cocoa MVC encourages the use of massive view controllers, ultimately leading to the view controller handling all operations.

Realistic Cocoa MVC

Although testing such tightly coupled massive view controllers is quite difficult, Cocoa MVC performs the best in terms of development speed among the existing options.

MVP

In MVP, the Presenter has no relationship with the lifecycle of the view controller, allowing the view to be easily replaced. We can think of UIViewController as the View.

Variant of MVC

There is another type of MVP: MVP with data binding. As shown in the figure, the View is tightly coupled with the Model and Controller.

MVP

MVVM

MVVM is similar to MVP, but MVVM binds the View to the View Model.

MVVM

VIPER

Unlike the three-layer structure of MV(X), VIPER has a five-layer structure (VIPER View, Interactor, Presenter, Entity, and Routing). This structure allows for good responsibility distribution but has poorer maintainability.

VIPER

Compared to MV(X), VIPER has the following differences:

  1. The logic processing of the Model is transferred to the Interactor, so Entities have no logic and are purely data storage structures.
  2. ==UI-related business logic is handled in the Presenter, while data modification functions are handled in the Interactor==.
  3. VIPER introduces a routing module, Router, to implement inter-module navigation.

Lambda Architecture

· 2 min read

Why Use Lambda Architecture?

To address the three issues brought by big data:

  1. Accuracy (good)
  2. Latency (fast)
  3. Throughput (high)

For example: The problems of scaling web browsing data records in a traditional way:

  1. First, use a traditional relational database.
  2. Then, add a "publish/subscribe" model queue.
  3. Next, scale through horizontal partitioning or sharding.
  4. Fault tolerance issues begin to arise.
  5. Data corruption phenomena start to appear.

The key issue is that in the AKF Scaling Cube, ==having only the X-axis for horizontal partitioning of one dimension is not enough; we also need to introduce the Y-axis for functional decomposition. The lambda architecture can guide us on how to scale a data system==.

What is Lambda Architecture?

If we define a data system in the following form:

Query=function(all data)

Then a lambda architecture is:

Lambda Architecture

batch view = function(all data at the batching job's execution time)
realtime view = function(realtime view, new data)

query = function(batch view, realtime view)

==Lambda architecture = Read/Write separation (Batch Processing Layer + Service Layer) + Real-time Processing Layer==

Lambda Architecture for big data systems

Thinking Software Architecture as Physical Buildings

· One min read

What is architecture?

Architecture is the shape of the software system. Thinking it as a big picture of physical buildings.

  • paradigms are bricks.
  • design principles are rooms.
  • components are buildings.

Together they serve a specific purpose, like a hospital is for curing patients and a school is for educating students.

Why do we need architecture?

Behavior vs. Structure

Every software system provides two different values to the stakeholders: behavior and structure. Software developers are responsible for ensuring that both those values remain high.

::Software architects are, by virtue of their job description, more focused on the structure of the system than on its features and functions.::

Ultimate Goal - ==saving human resources costs per feature==

Architecture serves the full lifecycle of the software system to make it easy to understand, develop, test, deploy, and operate. The goal is to minimize the human resources costs per business use-case.

Designing very large (JavaScript) applications

· 3 min read

Very Large JS App = a lot of developers + large codebase

How to deal with a lot of developers?

empathy

What is a ==senior engineer==? A team of senior engineers without junior engineers is a team of engineers

  1. being senior means is that I’d be able to solve almost every problem that somebody might throw at me.
  2. make the junior engineers eventually be senior engineers.

what’s the next step of a senior engineer?

  1. senior: “I know how I would solve the problem” and because I know how I would solve it I could also teach someone else to do it.
  2. next level: “I know how others would solve the problem “

good programming model

how people write software, e.g. react/redux, npm. Here comes a model that affects all large JS apps - code splitting.

  1. People have to think what to bundle and when to load
  2. ==route based code splitting==
  3. But, what if it is not enough?
    1. lazy loaded every single component of our website
    2. How does Google do? split them by rendering logic, and by application logic. ==simply server side render a page, and then whatever was actually rendered, triggers downloading the associated application bundles.== Google does not do isomorphic rendering - no double rendering

How to deal with a large codebase?

==Code Removability/Delete-ability==

e.g. CSS is bad in code removability

  1. one big css file. There is this selector in there. Who really knows whether that still matches anything in your app? So, you end up just keeping it there.
  2. People thus created CSS-in-JS

==avoid central configuration of your application at all cost==

  1. Bad example
    1. central routes configuration
    2. central webpack.config.js
  2. Good example
    1. decentralized package.json

avoid central import problem: router imports component A, B, and C

  1. to solve this problem, do ==“enhance” instead of “import”==
  2. However, developers still have to decide when to enhance and when to import. Since this might lead to very bad situations, we make “enhance” illegal, nobody gets to use it–with one exception: generated code.

avoid base bundle pile of trash

  1. e.g. base bundle should never contain UI code
  2. Solve this problem with forbidden dependency tests
  3. ==most straight forward way must be the right way; otherwise add a test that ensures the right way.==

Be careful with abstractions

We have to become good at finding the right abstractions: Empathy and experience -> Right abstractions